An editorial comment . . .



Critical Problems in Mathematics Education: 
The Search for the Holy Grail 



Thomas P. Carpenter 
University of Wisconsin 

At the turn of the century, Hilbert posed 23 problems whose solution
would lead to fundamental advances in mathematics. It has been proposed
that this exercise serve as a model for identifying the critical problems
in mathematics education. Richard Shumway asked the authors of chapters in
Research in Mathematics Education (Shumway, 1980) to follow Hilbert's example
and identify a small number of significant problems based on the research
reviewed. Recently, David Wheeler invited a number of colleagues in mathematics
education to go through the same exercise so that he might arrive
at a synthesis of critical problems that would give direction to research
in mathematics education much as Hilbert's 23 problems did for mathematics.

A case can be made that educational research has failed to provide
definitive answers to serious educational problems because the problems
have not been clearly articulated. Platt (1964) has argued that the areas
of science in which the
most dramatic successes have occurred are those in which the practitioners
have invested a substantial effort in identifying and analyzing the critical
problems. It is not clear, however, that educational problems can be subjected
to the same level of analysis or be as clearly solved as problems in
mathematics, microbiology, or high energy physics. Cronbach (1975) argues that
conclusions in social science are generally not absolute. He proposes that
"we cannot store up generalizations and constructs for ultimate assembly 
into a network" (p. 123). In other words, even if fundamental problems in 
mathematics education could be identified, it is not apparent that they 
could be clearly solved. 

In the last 10 to 15 years, a number of research areas promised to 
provide answers to fundamental questions in mathematics education, but what 
specific changes in mathematics instruction have been based on the extensive
body of Piagetian research or research on discovery learning or aptitude- 
treatment interactions? 

I believe that research is unlikely to provide definitive answers 
to broad fundamental educational questions. I think that the most progress 
will be made if we are more modest in our goals, our research is more 
clearly focused, and our conclusions are more carefully qualified. I 
am suggesting that research be directed at developing what Shulman (1974) 
calls middle-range theories. These theories fall between the task-specific
working hypotheses that are generated to explain individual behaviors and 
the comprehensive theories that attempt to encompass all of instruction in 
mathematics. 

For the most part, I believe that attempts to draw all-encompassing 
conclusions from educational research at best have not been terribly pro- 
ductive and at worst have been misleading. For example, I think that the 
claims for academic learning time and direct instruction must be highly 
qualified if one acknowledges that the goals of instruction include 
understanding and problem solving. Broader conclusions based on research 
in this area could potentially lead to many inappropriate decisions about 
effective teaching. On the other hand, although I am generally sympathetic 
to the finding that teaching for understanding facilitates retention and 
transfer, I think that this conclusion is so broad that it has had relatively 
little impact on instruction in mathematics. 

The kind of direction that I am suggesting is illustrated by a discussion 
at a recent conference on concept learning. In one of the working groups, 
the thesis was put forth that a certain sequence of positive and negative 
examples was most effective in teaching mathematics concepts. Alan Schoenfeld 
proceeded to identify a number of concepts that everyone present agreed 
would be most effectively taught using only positive examples. In fact, 
for every sequence of positive and/or negative examples the group could 
come up with, he was able to find a concept for which that sequence would 
be most effective. The point he was making is that conclusions about 
concept learning in general are not appropriate. The most effective way 
to teach a particular concept depends on the concept. 

Research is beginning to provide a picture of how specific mathematics
concepts are acquired and is beginning to provide an understanding of 
the instructional process in particular contexts (Romberg & Carpenter, 
in press). But much of this research is descriptive, and it is not 
clear that it can readily be captured in 23 critical problems. This
does not mean that careful analysis is not necessary. A great deal 
of sloppy thinking is excused on the grounds that it is necessary to 
be flexible in clinical or ethnographic research. The clearest 
insights have come when the research was guided by some theory, and 
it was possible to put structure on the results. Thus, I do believe 
that it is necessary to identify the critical problems within a specific 
domain, but I am not sure that these critical problems will encompass 
all of mathematics education. 

I would like to end this editorial with a disclaimer. There 
was nothing in either Shumway's or Wheeler's requests for critical 
problems to preclude the kinds of limits on problems that I have proposed. 
The straw person that I have attempted to knock down is my own creation, 
not theirs. Furthermore, I do not intend to disparage the search for 
larger questions. I do not believe that it is either naive or a 
waste of time. Like the knights of the round table searching for the 
holy grail, the search itself can prove instructive. The larger 
questions are worth asking; I'm just not sure we are going to find 
definitive answers that will significantly influence instruction. 
The questions themselves, however, may provide direction to the more 
clearly focused research, but we must be sure they do not limit it. 
Battista, Michael T.; Wheatley, Grayson H.; and Talsma, Gary. THE
IMPORTANCE OF SPATIAL VISUALIZATION AND COGNITIVE DEVELOPMENT FOR 
GEOMETRY LEARNING IN PRESERVICE ELEMENTARY TEACHERS. Journal for 
Research in Mathematics Education 13: 332-340; November 1982. 



Abstract and comments prepared for I.M.E. by GLENDA LAPPAN, Michigan
State University. 

1. Purpose 

The primary focus of the study was to examine the effect of spatial 
ability and cognitive development on learning mathematics; in particular, 
on learning geometry. The effect of instruction in geometry on spatial 
ability was also studied. 

2. Rationale 

Correlational studies have long shown a positive relationship between
spatial ability and achievement in mathematics. Although many studies
in the last six to eight years have investigated this relationship in
an attempt to understand the nature of the interaction of spatial ability
and learning mathematics, the results leave much unexplained. There has
also been considerable discussion of the importance of cognitive development.
Since students in the concrete operational stage rely heavily on concrete
and pictorial representations which have spatial components,

    there is reason to believe that investigating the
    interaction of spatial ability and cognitive development
    will shed some light on the roles that both factors
    play in mathematics learning. (p. 333)

3. Research Design and Procedures 

The subjects of the study were 82 college students, mostly females, 
enrolled in four sections of a one-semester college course in geometry 
for preservice elementary teachers (PSET). The independent variables 
were spatial ability and cognitive development; the dependent variable 
was achievement in geometry. To measure spatial ability, the Purdue 
Spatial Visualization Test was given to the students at the beginning 
and at the end of the semester. The cognitive development of the PSET 
was measured at the end of the semester by a modified version of the
Longeot Test of cognitive development. Thirty-one of the 82 subjects 
received a score of 12 or higher on the 15-item test and were classified 
as formal operational. Achievement in geometry was measured by the total 
of the students' scores on three tests given during the semester. 

The geometry course was activity-oriented. The students were 
involved with many investigations and materials that had spatial components. 

4. Findings 

The students' pre and post spatial visualization scores and cognitive
score were each significantly correlated with the course grade (p < .001). 
The spatial score correlations with the cognitive score were also 
significant (pre p < .01, post p < .001) as were the pre spatial scores 
with the post spatial scores (p < .001). 

In the regression analysis of course grade, the primary predictor 
was the cognitive score (p < .001) with the pre spatial score, accounting 
for an additional 6% of the variance (p < .01). 

The subjects with median scores on either the cognitive or pre 
spatial test were excluded and the remaining 59 subjects with high or 
low scores were analyzed in a two-way analysis of variance. The main
effect due to the spatial level and the interaction were not significant. 

The posttest scores of spatial visualization were significantly 
higher than the pretest scores of spatial visualization (p < .001). 

5. Interpretation 

The authors state that their findings give a "strong indication 
that cognitive development is a better predictor of a geometry course 
grade than spatial ability" (p. 338), but both are important in learning 
geometry. The authors hypothesize that the analytic nature of many of the 
spatial items on course tests may have reduced the correlation between 
course grade and spatial scores. 

Some Purdue Spatial Visualization test-retest data on 36 teachers 
not included in this study lend support to the claim that geometry activities 
such as those provided in the course contributed to the significant 
difference found in spatial visualization scores. The gain for the
students in this study was significantly higher than the test-retest gain
for the "control" teachers (p < .05).

The increase in spatial scores for students above and below the 
median was compared. The average gain for students below the median 
was 3.95; for those above the median, the average gain was .92. This
suggests that further research is needed to clarify whether or not instruc- 
tion such as the activity-based geometry instruction in this study helps 
improve spatial ability and if so, whether instruction benefits one group 
of students more than another. 



Abstractor's Comments

The authors are to be commended for adding an interesting twist to
the question of the role of spatial ability in the learning of mathematics. 
They focused their attention on the learning of geometry, a critical but 
sadly neglected component of the K-12 curriculum, and raised the question 
of the relative importance of spatial ability and cognitive development
in predicting achievement in geometry. 

It is, of course, important to study preservice teachers, since they 
ultimately become critical factors in the education of children. However, 
we must be careful not to infer that this study tells us anything about 
children, or, for that matter, anything about college students (males or 
females) in general. 

The authors give a great deal of information on both the Purdue 
Spatial Visualization Test and the Sheehan version of the Longeot Test 
of cognitive development to justify their choice of these two instruments. 
However, so little information is given about the three tests used as 
a course grade score, that one is not even sure that the same tests were 
given to all four sections of students. I am left wondering who designed 
the tests? What format was used for questions? If the items were not 
multiple choice, how was a protocol for scoring established? What was 
the reliability of each course exam? Since the mean for the group was over 
80% on the course exams, were the exams discriminating enough to provide 
useful measures? Was each of the four sections taught by a different
instructor? Was the weekly discussion of course content sufficient to 
assure that the treatment was fairly standard from class to class? Were 
there any differences among the four classes on any of the measures? 

The spatial visualization test was given as both a pretest and a
posttest. However, the cognitive development test was given only as a posttest.
How can we be sure that the study of geometry itself did not affect the 
cognitive development of the students? Since the study was interested 
in how well cognitive development predicted success in geometry, would it 
not have been better to test the students' cognitive development prior to 
the treatment? 

On the question of whether or not spatial ability can be improved 
by training, this study offers support for the effectiveness of instruction 
of an activity-based sort. This is an important result for mathematics 
education. We could place greater confidence in this result if a more 
careful look at the test-retest scores for other preservice elementary 
teachers without spatial training could be given. For example, Pre-Experimental
Design 3 from Campbell and Stanley (1963) would be appropriate:

        X    O1
        - - - - -
             O2

Here X is the treatment of taking the Purdue Spatial Visualization Test
the first time; O1, the scores of the same group the second time; O2,
the scores on the Purdue Spatial Visualization Test for an equivalent 
group of students. The authors acknowledge this problem and do report 
comparison data between gains for two groups, the students that received 
the instruction and another group that did not. 

Reference 
Beckerman, Terrill and Good, Thomas. THE CLASSROOM RATIO OF HIGH AND
LOW-APTITUDE STUDENTS AND THE EFFECT ON ACHIEVEMENT. American Educational
Research Journal 18: 317-327; Fall 1981.



Abstract and comments prepared for I.M.E. by RICHARD CROUSE, 
University of Delaware. 

1. Purpose

The study tested the hypothesis that the ratio of high-aptitude students
to low-aptitude students in third- and fourth-grade classrooms influences
the mathematics achievement of these students.

2. Rationale

Recent research has demonstrated that the classroom process can 
be altered in ways that improve student achievement. However, in 
comparison to the growing literature on instructional process, there 
is very little information on how the types of students present in 
classrooms influence instructional process or outcomes. Also, much 
of the research in this area has used the school rather than the 
classroom as the unit of analysis for testing the student characteristic
ratio/achievement hypothesis. This investigation was thus
conducted using the classroom as the unit of analysis, since this 
analysis has more potential for explaining student progress than re- 
search analyzed at the school level if the variable of interest func- 
tions at the classroom level. 

3. Research Design and Procedures

The sample for the investigation was 103 third- and fourth-grade 
classrooms drawn from a large metropolitan school district that basically 
served a middle-class population. Pre- and post-mathematics achieve- 
ment data and aptitude scores were available for these students. 

Within grade level, students were assigned to high, middle, or 
low aptitude groups on the basis of their scores on the Cognitive 
Ability Test. If classrooms did not have at least four students each 
of high, middle, and low aptitude, they were dropped from the analysis. 
This criterion reduced the number of classrooms from 103 to 81.

Two types of classrooms were then operationally defined. More 
favorable classrooms were those in which low-aptitude students were 
less than a third of the classroom population and high-aptitude students 
were more than a third. Less favorable classrooms were defined as 
those in which low-aptitude students were more than a third of the class 
and high-aptitude students were less than a third. Fifty-five class- 
rooms were classified as more or less favorable classrooms. This in- 
cluded 27 third-grade classrooms with 14 more favorable and 13 less 
favorable, and 28 fourth-grade classrooms with 17 more favorable and 
11 less favorable. 
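
The screening and classification rules described above can be written out compactly. The following is a minimal sketch in Python, not the investigators' own procedure: the function name, the argument names, and the label "unclassified" for classrooms that meet neither definition are assumptions made for illustration.

```python
from typing import Optional


def classify_classroom(n_low: int, n_middle: int, n_high: int) -> Optional[str]:
    """Label a classroom from its counts of low-, middle-, and high-aptitude students.

    Returns None for a classroom that would be dropped (fewer than four
    students in any aptitude group); otherwise returns "more favorable",
    "less favorable", or "unclassified" (a hypothetical label).
    """
    if min(n_low, n_middle, n_high) < 4:
        return None  # dropped from the analysis
    total = n_low + n_middle + n_high
    # Compare 3 * count with the total to avoid floating-point thirds.
    low_under, low_over = 3 * n_low < total, 3 * n_low > total
    high_under, high_over = 3 * n_high < total, 3 * n_high > total
    if low_under and high_over:
        return "more favorable"
    if low_over and high_under:
        return "less favorable"
    return "unclassified"


# Hypothetical classroom compositions (not data from the study).
print(classify_classroom(5, 9, 13))   # more favorable
print(classify_classroom(12, 8, 6))   # less favorable
print(classify_classroom(9, 9, 9))    # unclassified
print(classify_classroom(3, 12, 12))  # None
```

Under these definitions a classroom can pass the four-student screen and still fall into neither category, which is consistent with only 55 of the 81 retained classrooms being classified.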

The dependent measure used in the study was students' total mathe- 
matics scores on the Iowa Test of Basic Skills. Residual gain scores 
were computed for students by using each student's score on the pretest
as a covariate. Before conducting an analysis of variance, it was 
ascertained that levels of student aptitude were comparable across 
classrooms. 
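
One common way to obtain a residual gain score is to regress posttest scores on pretest scores and keep each student's residual. The sketch below illustrates that idea with a simple linear regression; the exact model used in the study is not given here, so the single-predictor form, the function name, and the sample scores are all assumptions.

```python
from statistics import mean


def residual_gains(pretest, posttest):
    """Return each posttest score minus its least-squares prediction from the pretest."""
    mx, my = mean(pretest), mean(posttest)
    sxx = sum((x - mx) ** 2 for x in pretest)
    sxy = sum((x - mx) * (y - my) for x, y in zip(pretest, posttest))
    slope = sxy / sxx
    intercept = my - slope * mx
    return [y - (intercept + slope * x) for x, y in zip(pretest, posttest)]


# Hypothetical scores for five students (not data from the study).
pre = [20, 25, 30, 35, 40]
post = [24, 31, 33, 42, 45]
gains = residual_gains(pre, post)
print([round(g, 2) for g in gains])
```

By construction the residuals average zero and are uncorrelated with the pretest, so a positive value means a student scored higher on the posttest than the pretest alone would predict.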

4. Findings

Both low- and high-aptitude students in more favorable classrooms 
had higher achievement scores than comparable groups in less favorable 
classrooms (p < .05).

5. Interpretations

The investigators concluded from their findings that: 

(1) The possible effects due to the ratio of high-to-low aptitude 
students in a classroom are not accounted for by the usual statistical 
procedures used in process-product or teacher effectiveness research.

(2) Teacher effectiveness research may be confounded by the aptitude 
ratios in a classroom or by other uncontrolled classroom context effects. 

(3) Classroom mean gain scores may be the result of the interaction 
of aptitude ratio and instructional variables. 

(4) There are various explanations as to why a more favorable or 
less favorable environment influences student achievement. One expla- 
nation might be that the demands of the teacher might be different 
depending upon the various aptitude ratios. Thus, the pace could be
slower in less favorable classrooms, or perhaps more time is spent on 
management problems. 

(5) "Having a greater ratio of high-to-low aptitude students in a 
class does not automatically create a more favorable environment for 
low achievers, but it may increase the chance for such students to 
receive appropriate instruction." 

Abstractor's Comments 

This is an interesting study which attempts to attack an important 
problem in teacher effectiveness research. However, some information 
was not included which would have helped in the reading of this study. 

Among the questions which arise in connection with the reporting 
of this study are: 

(1) What was the duration of the experiment? 

(2) Table III in the report gives means for low- and high-aptitude 
groups — but means of what? 

(3) Since the study was of intact classrooms, was randomness 
achieved? 

In spite of these criticisms, this is an interesting study which 
has significance for the teaching of mathematics. As the investigators 
suggest, it would be important to further test the aptitude-ratio con- 
text effect to see if the findings are generalizable across grade levels 
and/or subject matter. Additional studies would also be needed to 
determine which ratio or ratios, if any, are most beneficial to students 
at various grade levels. 
Bergan, John R.; Towstopiat, Olga; Cancelli, Anthony A.; and Karp, Cheryl.
REPLACEMENT AND COMPONENT RULES IN HIERARCHICALLY ORDERED MATHEMATICS 
RULE LEARNING TASKS. Journal of Educational Psychology 74: 39-50; 
February 1980. 

Abstract and comments prepared for I.M.E. by IPKE WACHSMUTH, Northern 
Illinois University. 

1. Purpose 

"The present study examined ordered and equivalence relations and 
performance errors in a set of hierarchically arranged fraction identi- 
fication tasks to determine the extent to which observed relations 
and errors were congruent with the component-rule and rule-replacement 
hypotheses" (pp. 41-42). 

2. Rationale 

The investigation was conducted in the context of mathematical 
rule learning. For different rule-learning tasks, hierarchical orderings 
of tasks can be considered in which subordinate tasks are prerequisite 
to super-ordinate tasks. The authors contrast two hypotheses about 
the conditions under which two rule-learning tasks can be expected to 
be in an ordered relation: 

(a) the component rule hypothesis suggested by Gagne, which 
states that two rule-learning tasks are ordered if one 
involves a rule that is a component of the second; 

(b) the rule-replacement hypothesis suggested by the authors,
which states that two rule-learning tasks are ordered if
one involves a rule that has to be replaced by a more complex
rule in order to apply also for the second.

For both cases, the authors cite studies in the realm of fractions to 
support the existence of rule-learning tasks that are ordered as indicated 
in the hypotheses. 

The two hypotheses lead to different expectations about order and 
equivalence relations between tasks and about probable performance errors. 
The authors believe that insights about the nature of the transition
from nonmastery to mastery with respect to a particular rule-learning 
task can be obtained through establishing the validity of one of the 
hypotheses. These insights could have important implications for 
instruction. 

3. Research Design and Procedures 

The two hypotheses were investigated using a set of tasks in which 
a fractional part of a set of n objects was to be identified; for example, 
two-fifths of a set of five. A child's behavior emitted in mastering 
such a task could be conceptualized in two rules (not necessarily being 
verbalized by the child):

(i) the "denominator rule," which states that to identify a
fraction with denominator r for a set of n objects, the
set must be partitioned into r equivalent subsets;

(ii) the "numerator rule," which states that to identify a fraction
with numerator s, one of r subsets established by the
denominator rule must be taken s times.

Since the numerator rule refers to the r subsets established by 
the denominator rule, the denominator rule can be regarded as a component 
of the numerator rule. In this respect, identifying one-fifth of
a set of five would be hypothesized to be a task subordinate to
identifying two-fifths of a set of five, because the counting involved
in the numerator rule could be omitted in the first, while it could 
not be omitted without impairing identification in the second case. 

In contrast, the authors illustrate rule replacement by an 
example involving another hypothetical rule:

(i') the "one-element rule," which equates each of the r subsets
mentioned in (i) with just one element in the set of n objects.

For instance, two-fifths of a set of five could be identified 
correctly in applying the numerator rule in connection with the one- 
element rule (in place of the denominator rule): one of the five 
elements (rather than subsets) established by the one-element rule is 
taken two times. Yet cases where the number of objects differs from
the denominator could only be expected to be identified correctly 
if the one-element rule is replaced by the more complex denominator 
rule. In this respect, identifying two-fifths of a set of five would 
be hypothesized to be a task subordinate to identifying two-fifths 
of a set of ten.
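
The contrast between the denominator rule and the one-element rule can be made concrete with a short sketch. The function names below are hypothetical, and the code is only an illustration of the rules as summarized above, not a model of how children actually reason.

```python
def denominator_rule_answer(s: int, r: int, n: int) -> int:
    """Objects to mark for s/r of n objects: partition into r equal subsets, take s of them."""
    if n % r != 0:
        raise ValueError("n must be divisible by r to form equal subsets")
    subset_size = n // r      # denominator rule: r equivalent subsets
    return s * subset_size    # numerator rule applied to those subsets


def one_element_rule_answer(s: int, r: int, n: int) -> int:
    """Objects marked when each 'subset' is treated as a single element."""
    return s                  # numerator rule applied to single elements


for s, r, n in [(2, 5, 5), (2, 5, 10)]:
    print(f"{s}/{r} of {n}:",
          denominator_rule_answer(s, r, n),
          one_element_rule_answer(s, r, n))
# 2/5 of 5: 2 2
# 2/5 of 10: 4 2
```

The two rules agree when the number of objects equals the denominator and diverge otherwise, which is why a task such as two-fifths of ten can separate children using the one-element rule from those using the denominator rule.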

The two hypotheses were systematically explored using the statistical 
technique of latent class models to assess equivalence and order 
relations among a set of eight fraction identification tasks. The 
tasks, presented twice in random order, required identification of 
a fractional subset of a given set of circles. Included were the
identification of one-third of a set of three, two-thirds of a set
of three, one-third of a set of six, two-thirds of a set of six,
one-fifth of a set of five, two-fifths of a set of five, one-fifth of
a set of 10, and two-fifths of a set of 10.

A total of 456 middle-class children (213 boys and 243 girls) 
of ages 7-12 from mixed ethnic groups were group-tested in public 
and parochial elementary schools. Sample responses to the questions 
given in testing booklets were demonstrated and understanding of all 
tasks and directions was ensured before and during the testing. 
Time was given as necessary to complete all problems. Each response 
was scored as passing or failing. 

Four latent class models were tested: 

Model H1, representing equivalence of tasks, was composed of three
latent classes: a nonmastery, a mastery, and a transition class.
It was assumed that masters would pass all problems, nonmasters would
fail all problems, and transitionals would have a passing probability
equal across items.
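
The three-class structure just described implies a probability for every possible pass/fail pattern. The sketch below computes it under the stated assumptions; the function name, the parameter names, and the numerical values are hypothetical and are not estimates from the study.

```python
from itertools import product


def three_class_pattern_probability(pattern, p_master, p_nonmaster, p_pass_transitional):
    """Probability of a pass/fail pattern under a mastery/nonmastery/transition model.

    pattern: sequence of 1 (pass) and 0 (fail), one entry per item.
    p_master, p_nonmaster: class proportions; transitionals make up the rest.
    p_pass_transitional: passing probability shared by all items for transitionals.
    """
    p_transitional = 1.0 - p_master - p_nonmaster
    passes = sum(pattern)
    fails = len(pattern) - passes
    prob = (p_transitional
            * p_pass_transitional ** passes
            * (1 - p_pass_transitional) ** fails)
    if fails == 0:
        prob += p_master      # masters pass every item
    if passes == 0:
        prob += p_nonmaster   # nonmasters fail every item
    return prob


# Two tasks presented twice give four items and 16 possible patterns.
patterns = list(product([0, 1], repeat=4))
total = sum(three_class_pattern_probability(p, 0.5, 0.3, 0.6) for p in patterns)
print(round(total, 10))  # 1.0
```

Fitting such a model amounts to choosing the class proportions and the transitional passing probability so that these pattern probabilities match the observed pattern frequencies as closely as possible; the chi-square values reported for each model summarize that match.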

Model H2, representing equivalence of tasks, included all classes
of H1 plus four classes representing inconsistency of responses across
different items. That is, pairs of tasks hypothesized to be equivalent
were expected to be responded to inconsistently by transitional
individuals.

Model H3, representing ordering of tasks, included all classes of
H2 plus one class representing individuals who were masters of the
subordinate task and nonmasters of the super-ordinate task with respect
to two tasks hypothesized to be ordered.

Model H4, representing ordering of tasks, involved an asymmetrical
transition between nonmastery and mastery in that no latent classes 
were included to represent the case in which performance on a super- 
ordinate task was superior to performance on a subordinate task. 

4. Findings 

For each pair of tasks, it was determined which of the four
models best fit the data. In no case was H1 preferred. H2 was
preferred for all task sets involving denominator variations and for
one task set that varied numerators. H3 was preferred for two of the
four task sets involving numerator variations. H4 was preferred
for one task set involving numerator variations and for all task
sets involving variations in the number of subset elements. Of all
performance errors, 56% were consistent with the one-element rule.

Except for four cases, the preferred models were characterized 
by non-significant chi-square values. The four significant cases 
included the pairs one-fifth of ten vs. two-fifths of ten, two-thirds
of six vs. two-fifths of ten, one-third of three vs. one-third of six,
and two-thirds of three vs. two-thirds of six.

5. Interpretation 

The findings for items involving denominator variations support 
the hypothesis that two tasks which involve a common rule will be 
equivalent with respect to mastery of that rule. A partial inconsistency 
in responding is part of the transition process from nonmastery to mastery. 

The findings for items involving numerator variations, with one 
exception, support the component rule hypothesis. The exception 
suggests that in one case the acquisition of numerator rule and 
denominator (or one-element) rule occurred concurrently. In two cases 
of an established ordering of tasks, it occurred that nonmasters of a 
subordinate task were in transition with respect to the super-ordinate
task. 
The "most important finding in this study" (p. 49) was the preference
for H4 in all cases varying the number of subset elements. This finding
is consistent with the rule-replacement hypothesis and supports the 
view that many children used two qualitatively distinct rules (one- 
element and denominator rules) in handling fraction denominators. 
The authors relate this finding to cognitive developmental theory 
and suggest that, as in the course of broad-scale development, qualitative
changes may occur in children's thinking for specific learning tasks. 

The following implications for instruction are formulated: The 
fact that advancement from nonmastery to mastery involves a transitional 
state suggests differential instruction with respect to the state of the 
learner. The fact that when varying numerators many nonmasters of a 
subordinate task were transitional with respect to a super-ordinate 
task suggests that both tasks may be learned at the same time. The 
rule-replacement hypothesis could be relevant for analyzing cognitive
changes and diagnosing problems in children's learning through determina- 
tion of the rules they use in task performance. The insights gained 
could provide a basis for instructional sequencing. 

The authors raise the following research questions:

• to investigate the origin of rules used by students; 

• to investigate factors that affect sequential and simultaneous 
rule acquisition; 

• to investigate whether rule-replacement occurs in many areas 
of learning; 

• to investigate to what extent rule-replacement can be
affected by instruction. 



Abstractor's Comments

Marking one-third of three circles should be as easy or difficult
as marking one-fifth of five circles, marking two-fifths of five
should be more difficult than marking one-fifth of five, and marking
one-fifth of ten more difficult than marking one-fifth of five:
these were the key ideas in this very thoughtfully designed study.
The aim was to obtain insights into the rules that govern children's 
thinking in performing such tasks, which could explain why one task 
would be as easy as, or more difficult than, another. 

The two hypotheses formulated were hypotheses about rules that 
are hypothetical themselves. This makes the issue complicated since
hypothetical rules are unobservable; they can only be observed implicitly
in the results of their application. Latent class models were used 
to attack the problem of representing and testing hypotheses stated 
in terms of unobserved ("latent") variables as given with the rules. 

The results, in their general trend, support both hypotheses. 
Of particular interest are the findings formulated about transition
from nonmastery to mastery of a rule-learning task. From the written- 
test data, insights could not be obtained that give an explanation for 
the observed partial inconsistencies. Clinical interviews with a 
smaller sample might provide additional information. 

It should not be overlooked that the findings were not always as 
clear-cut as one might have hoped with this promising approach. Four
of the 12 task sets tested showed a significant discrepancy between
model and data set. For one of
these, one-third of three and one-third of six, this was due to the
fact that the number of response patterns where both of the super-
ordinate task items were passed and both of the subordinate items
were failed was unexpectedly large under the preferred model, H4. The
preference for H4, which modelled an asymmetrical transition not
accounting for such response patterns, was not rejected since the tested
improvement of the competing model over H4 was not significant.

In addition, I don't feel comfortable when a result is established 
through complicated modelling that, based on a p-value of .05, states the 
"occurrence of an equivalence relation" for the task set requiring 
identification of one-fifth and two-fifths of 10, which "simply suggests 
that the acquisition of the numerator rule and the acquisition of the 
denominator rule or one-element rule were concurrent for this task set," 
and then calls for research "to investigate factors affecting sequential 
and simultaneous rule acquisition" (p. 48; the analysis of standardized
residuals reveals that an unexpectedly large number of inconsistent
responses to identical items was responsible for the observed discrepancy).
The impact of the approach taken in the study is not affected, however.
With rule replacement, another important mechanism of learning has been
identified that certainly can be found in many other areas of mathematics
learning up to the calculus level (for example, compare the tasks of finding
the derivative of sin x and sin(x²); non-masters of the second task
frequently come up with cos(x²)). The research questions raised in the
context of rule replacement appear well worth considering. The origin
of rules that have not been taught, as in the case of the one-element 
rule, presumably has to do with "minimal discrimination": for a restricted 
class of tasks this simpler way of attacking a problem may have proven 
successful and thus was adopted by the learner as a rule that is not 
abandoned as long as no failure is realized. Confronting the learner with 
instructional situations that require finer discrimination to be made 
might be a possible way of affecting rule replacement. 
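
The calculus example mentioned above can be checked numerically. The following is a minimal Python sketch added for illustration; it is not part of the study under review, and all function names are hypothetical. It compares a numerical derivative of sin (x^2) with the chain-rule result 2x cos (x^2) and with the answer cos (x^2) that non-masters frequently give.

```python
import math

def numerical_derivative(f, x, h=1e-6):
    """Central-difference estimate of the derivative of f at x."""
    return (f(x + h) - f(x - h)) / (2 * h)

def f(x):
    """The second task from the text: sin(x^2)."""
    return math.sin(x ** 2)

def chain_rule_answer(x):
    """Correct derivative: the outer rule times the inner derivative 2x."""
    return 2 * x * math.cos(x ** 2)

def overgeneralized_answer(x):
    """The rule for sin x applied unchanged, dropping the factor 2x."""
    return math.cos(x ** 2)

x = 1.5
estimate = numerical_derivative(f, x)
print(abs(estimate - chain_rule_answer(x)) < 1e-6)       # True
print(abs(estimate - overgeneralized_answer(x)) < 1e-6)  # False
```

The two answers coincide only where 2x = 1 or cos (x^2) = 0, so a restricted set of practice tasks could leave the simpler rule unchallenged, which is the situation the "minimal discrimination" explanation describes.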
Dreyfus, Tommy and Eisenberg, Theodore. INTUITIVE FUNCTIONAL CONCEPTS:
A BASELINE STUDY ON INTUITIONS. Journal for Research in Mathematics 
Education 13: 360-389; November 1982. 



Abstract and comments prepared for I.M.E. by JOHN HUBER, 
Pan American University. 

1. Purpose

The purpose of this study was to assess the intuitive background of 
junior high school students as they develop the concept of function. In 
particular, the following hypotheses were tested: 

1. Intuitions on functional concepts grow with the pupils' progress 
through the grades. 

2. Intuitions are independent of sex. 

3. Intuitions of high-level students are more often correct than 
those of low-level students. 

4. Intuitions are more often correct in concrete situations than
in abstract ones.

2. Rationale

For the purpose of this study the term intuitions is taken to refer
to "mental representations of facts that appear self-evident" (p. 360).
The authors make the assumption that the intuitive meaning of a mathe-
matical idea is essential in order to develop the mathematical reason-
ing process. In addition, they feel that (1) intuitions can be trained
through appropriate activities, (2) a primary goal of education is to
enlarge the base of intuitions, and (3) the teaching process should be
based on the intuitive knowledge of the learner.

Based on its unifying nature and its frequent appearance through-
out the mathematics curriculum, the function concept is one of the most
central ideas in mathematics today. It is this high level of abstraction
and generalizability that makes the function concept quite complex.
First, it is not a single concept. It has a number of subconcepts asso-
ciated with it (e.g., domain, range, image of an element, etc.). These
are called "functional concepts" (p. 361) in this study. Second, the
same function may have various representations called "settings" (p. 361).
Third, functions differ in their level of abstraction and generalization
(e.g., number of variables, finite domain, finite range, implicit defini-
tion, explicit definition, recursive definition, etc.).

Each of these aspects is a major contributor to the difficulties 
students encounter in learning the concept of function. Based on these 
three aspects of function, the authors present a three-dimensional 
"function block" (p. 364) structure in which the first dimension repre- 
sents the various settings, the second dimension the various functional 
concepts, and the third dimension the levels of abstraction and general- 
ization. Based on the first two dimensions of this structure, the four 
hypotheses were formulated. 

3. Research Design and Procedures

The dependent variable, AVG, was the percent of correct items on 
a questionnaire designed by the authors. The items measured various 
concepts about abstract and concrete functions in three settings (dia-
grams, graphs, and tables).

External validity for the questionnaire was provided by a panel of 
five high school and college mathematics teachers classifying a pool of 
items according to the concepts concerned. Reliability coefficients 
were estimated using the KR-20 formula for the full test and the con- 
crete and abstract subtests. The reliability estimates were 0.91 (full 
test), 0.86 (concrete subtest), and 0.81 (abstract subtest). 
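
For readers unfamiliar with the KR-20 formula, the following minimal Python sketch shows how such a coefficient is computed from right/wrong item scores. The response matrix is hypothetical and is not the authors' data; the function name and the use of the population variance of total scores are assumptions of this sketch.

```python
def kr20(responses):
    """Kuder-Richardson formula 20 for a matrix of 0/1 item scores.

    responses: list of rows, one per student, each a list of 0/1 scores.
    Assumption of this sketch: total-score variance is the population variance.
    """
    n = len(responses)
    k = len(responses[0])
    # Proportion of students answering each item correctly.
    p = [sum(row[i] for row in responses) / n for i in range(k)]
    sum_pq = sum(pi * (1 - pi) for pi in p)
    totals = [sum(row) for row in responses]
    mean_total = sum(totals) / n
    var_total = sum((t - mean_total) ** 2 for t in totals) / n
    return (k / (k - 1)) * (1 - sum_pq / var_total)

# Hypothetical scores for 5 students on 4 items (illustration only).
scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(round(kr20(scores), 2))  # 0.8
```

Values near 1 indicate that the items rank students consistently, which is how the reported estimates of 0.81 to 0.91 would be read.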

The four independent variables were grade in school (Grade), 
ability-social-level (Absolv), Setting (diagram, graph, or table), and
Sex. The construct variable Absolv is a combination of ability level
and the extent to which the learning environment was socially disadvan- 
taged (pp. 367-369). 

At the beginning of the school year before any classes had studied 
the concept of function, the questionnaires were administered to students 
in grades 6, 7, 8, and 9 in Israel. Only those students completing 90% 
of the questionnaire (a total of 443) were included in the sample. 

4. Findings 

Using a four-way analysis of variance, AVG by Grade X Absolv X Set-
ting X Sex, 51% of the variance in total test scores was accounted for
by the model. The independent variables Grade, Absolv, and Setting were
found to be statistically significant (α = .05). The significant inter-
actions were Grade X Absolv, Absolv X Sex, and Grade X Absolv X Sex.

5. Interpretations

An overall increase in performance through the grades was observed, 
with a significant decrease occurring from Grade 7 to 8. The main prog- 
ress came in Grade 6 for high-Absolv and in Grade 8 for low-Absolv stu- 
dents. This supports the general cognitive performance theory of Lewy 
and Chen that socially disadvantaged students can learn the material, 
but it takes them longer to do so (p. 372). 

Comparing the performance in the three settings, it was found that 
at all grade levels the diagram setting presented more difficulty than 
the other two. This was attributed to the complexity of the diagrams 
and poor reproduction of several questions. Comparing performance in 
the graph and table settings with respect to grade, no preference was 
found. However, in comparing the two settings with respect to Absolv, 
high-Absolv students preferred a graph setting while low-Absolv students 
preferred the table setting. This suggests that subconcepts should be 
introduced in a graph setting for high-ability students and in a table 
setting for low-ability students. 

No differences in overall performance with respect to Sex were 
found. However, an interesting "switching" (p. 375) occurred in the 
Sex X Absolv and the Grade X Absolv X Sex interactions. Low-level males
performed worse than low-level females, while high-level males performed 
better than high-level females. A similar switching occurred for high 
Absolv in the Grade X Absolv X Sex interaction. Two possible explana- 
tions were given, one based on differences in the rate of physical 
development of males and females and the other based on differences in 
the seriousness of male and female students at this age. 

All factors contributing to the significant differences on the full 
test were significant on the concrete subtest. All factors except Set- 
ting and Grade X Absolv X Sex were significant on the abstract subtest. 
No additional factors appeared. 
Abstractor's Comments

The function concept is certainly one of the most fundamental and 
unifying concepts in mathematics. The authors are to be commended for 
undertaking a study in such an important area of mathematics. More 
studies in this area are needed. 

Several important results need to be examined. The role of intui- 
tions in the learning of mathematics needs to be pursued. Can intuitions 
be taught? Would a teaching process based on intuitions lead to more 
meaningful learning of mathematics? Answers to these questions are 
essential to the long-range implications of this study. 

The "function block" paradigm not only provides a model for the 
study of functional concepts as they relate to the study of the concept 
of function, but will also provide a framework for the study of vertical 
and horizontal transfer. This model could also be applied to studies 
of the attainment of other mathematical concepts such as variable, set, 
and equation. 

The conclusion that high-Absolv students prefer a graph setting 
while low-Absolv students prefer a table setting is very important. What 
type of learning results in each of these settings? Is one instrumental 
and the other relational? Are they different in level or type? These 
questions need answers. 

The concept of variable is certainly essential to the understanding
of the concept of function. Yet this critical functional concept was
not explored in this study.

Several weaknesses in the statistical analysis need to be noted. 
Based on the "function block" paradigm, four hypotheses were formulated. 
However, in analyzing the results an ANOVA model with first- and second-
order interactions was used. The authors should either have provided a
theoretical base for hypothesizing such interactions or have omitted
them from the ANOVA model. It is essential that the statistical model
be the same as the model formulated by the hypotheses.

The first hypothesis formulated was, "Intuitions on functional con-
cepts grow with pupils' progress through the grades" (p. 366). Again,
the statistical technique was inappropriate. A significant main effect
implies only that a trend (linear, quadratic, etc.) is present. A quad-
ratic trend would not support the hypothesis the authors presented. A
test of linear trend in grades should have been used. 
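
The distinction between an omnibus main effect and a test of linear trend can be sketched in a few lines of Python. The grade means, the group size, and the function names below are hypothetical and are not taken from the study; the coefficients are the standard orthogonal polynomial contrasts for four equally spaced levels (here, Grades 6 through 9).

```python
def contrast(means, coeffs):
    """Value of a contrast: sum of coefficient times group mean."""
    return sum(c * m for c, m in zip(coeffs, means))

def contrast_sum_of_squares(means, coeffs, n_per_group):
    """Sum of squares (1 df) for a contrast, assuming equal group sizes."""
    value = contrast(means, coeffs)
    return value ** 2 / (sum(c ** 2 for c in coeffs) / n_per_group)

# Orthogonal polynomial coefficients for four equally spaced levels.
LINEAR = (-3, -1, 1, 3)
QUADRATIC = (1, -1, -1, 1)

# Hypothetical mean percent correct for Grades 6-9 (illustration only).
hypothetical_means = [50.0, 58.0, 55.0, 66.0]
n_per_group = 10  # hypothetical equal group size

linear_ss = contrast_sum_of_squares(hypothetical_means, LINEAR, n_per_group)
quadratic_ss = contrast_sum_of_squares(hypothetical_means, QUADRATIC, n_per_group)
print(linear_ss, quadratic_ss)
```

Each sum of squares would then be divided by the error mean square from the ANOVA to give an F ratio with one degree of freedom in the numerator; only a significant linear component with a positive contrast value speaks directly to the hypothesis of growth through the grades.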

In analyzing the four-way ANOVA, several interactions were signif- 
icant. With significant interactions, interpretation of main effects is 
questionable. 

Except for the few weaknesses noted above, this study should form 
an excellent foundation for future studies in the attainment of the con- 
cept of function. In addition, the function block paradigm should form 
an excellent theoretical framework for other studies. 
Galbraith, P. L. ASPECTS OF PROVING: A CLINICAL INVESTIGATION OF
PROCESS. Educational Studies in Mathematics 12: 1-28; February 
1981. 



Abstract and comments prepared for I.M.E. by LEWIS R. AIKEN, 
Pepperdine University.



1. Purpose

The goal of this investigation was to determine the perceptions 
of secondary school pupils with respect to modes of argument having 
accepted status in mathematics. More specifically, an effort was 
made to attain insight into pupils' understandings of formal explan- 
ation and proof in mathematics and how objects are used in the math- 
ematical world. 



2. Rationale 

The background research and theorizing concerned with the rela- 
tionships of age and experience to mathematical reasoning are sum- 
marized briefly at the beginning of the paper. In particular, Bell's 
(1979) description of the meaning of proof in terms of verification 
or justification, illumination, and systematization is emphasized. 
Bell's research stemming from the proposition that pupils will not 
employ formal proofs until they understand the public status of 
knowledge and the value of public verification is reviewed. Van 
Dormolen's (1977) conception of different levels of functioning in 
logical thinking is also discussed. Van Dormolen's examples are 
related to Van Hiele's three levels of thinking: (1) a ground level, 
in which thinking is limited to a particular example; (2) Level 1, 
in which concepts are more abstract but still limited to the domain 
of discourse; (3) Level 2, in which local organization is understood 
and the person learns to reason about reasoning. 

Relying on the research of Bell and Van Dormolen in particular, 
the present investigator was concerned with identifying discrepancies 
between the thought processes of pupils and accepted mathematical 
reasoning processes regarding specific problems. He was interested in 
determining the extent to which pupils are aware of the need for and 
actually use specific techniques and concepts in evaluating proofs
and explanations. 

3. Research Design and Procedures 

The clinical interview technique was employed to study pupils' re-
sponses to three selected mathematical problem situations: Game of 25,
Game of 7, and Quadrilaterals. Interviewing took 30-40 minutes per
pupil. The pupil was free to use paper and pencil and was asked a series
of pre-planned questions while he or she attempted to solve each problem.
The prompting
questions continued until the problem was solved or the prompts were 
exhausted. Complete responses to each item were obtained from a min- 
imum of 170 pupils aged 12-17 in the Brisbane (Queensland), Australia 
public schools. The interviews, which were tape-recorded for later 
analysis, were conducted by postgraduate students in education. 

4. Findings 

The 170 response protocols to each of the three items were analyzed 
in terms of the degree of completeness and methods of proof employed. 
Tabulations were made on such variables as complete checks, partial 
checks, etc., but no statistical analyses were reported. Empirical 
findings with respect to variety and completeness of checking, identifi- 
cation and use of principle, chaining of inferences, domain of validity 
of generalization, literal interpretation of statements and conditions, 
distinction between implication and equivalence, the meaning of defini- 
tion, and proof structure were analyzed intuitively and considered at 
length in the paper. 

5. Interpretations 

The "clinical" interpretations of the findings of this investigation 
are in terms of eight components identified from clusters of responses 
given in the three problem situations: variety/completeness in checking;
proof/explanation related to an external principle; linking of inferences;
domain of validity of generalizations; literal interpretations of data;
evaluating statements/distinguishing implication and equivalence; meaning
of definitions; and proof structure. The investigator concluded that
the response types identified in this investigation indicate that the 
majority of pupils do not have an objective, detached view of problems, 
but are rather restricted and occasionally even employ psycho-emotional 
approaches to problem solution. Correcting these limitations in pupil 
perceptions and thinking will require greater attention in schools to 
the development of high-level thinking across many contexts. 

Abstractor's Comments 
This is an interesting, thought-provoking paper concerning the 
perceptions and logical thinking processes of secondary school pupils 
about mathematical problems. It is a heuristic paper in that, although 
it provides little concrete numerical data and few definitive findings, 
it should serve to generate numerous hypotheses for empirical investi- 
gation. 

The clinical, or phenomenological-intuitive, approach used in this 
investigation has well-known limitations as a scientific method. The 
inherent subjectivity of the approach poses many questions concerning 
objectivity, reliability of testing and interviewing, generality of 
the findings, etc. In addition, insufficient data on the nature of
the sample are included, and the lack of statistical tests of signif-
icance is noteworthy (at least to an American psychologist!). However,
the clinical approach has many adherents and has generated some intri- 
guing results. Furthermore, it is a necessary approach in studying 
such subjective phenomena as the thought processes or mental strategies 
employed by secondary school students in attempting to solve problems. 
Also, in defense of the investigator's methodology, it was not completely 
open-ended: specific, prearranged prompts were given by the inter- 
viewers. And although the reader is not told how many postgraduate 
students served as interviewers and how they were trained, the sample 
of respondents appears to have been sufficiently large. 
Heller, Kirby A. and Parsons, Jacquelynne Eccles. SEX DIFFERENCES IN
TEACHERS' EVALUATIVE FEEDBACK AND STUDENTS' EXPECTANCIES FOR SUCCESS IN
MATHEMATICS. Child Development 52: 1013-1019; September 1981. 



Abstract and comments prepared for I.M.E. by DALE DROST, 
Memorial University of Newfoundland. 

1. Purpose

The purposes of this study were to investigate the existence of sex 
differences in teachers' use of evaluative feedback in junior high school 
mathematics classes and in students' expectancies for success in mathe- 
matics. 

2. Rationale 

Concern was expressed for the relative underparticipation of females 
in high school mathematics courses and the subsequent effects of this on 
future educational and career options. The researchers felt that the 
variables chosen for study might be related to participation in more 
advanced mathematics classes. 

The portion of the study which dealt with the teachers' use of eval-
uative feedback was modeled on the work of Dweck et al. (1978). These
researchers had identified differences in the type of praise and criti- 
cism received by males and females in fourth- and fifth-grade classrooms. 

With respect to students' expectancies for success, research was 
cited supporting the position that performance is related to expectancy 
and that lower expectancies are often found with females than with males. 
The junior high school years were chosen for study based on the claim 
that these are the years when sex differences in attitudes towards and 
achievement in mathematics begin to emerge. 

3. Research Design and Procedures 

Data were collected by three methods: a classroom observational 
system, a student questionnaire, and a teacher questionnaire. 

Five observers undertook approximately three weeks' training on a 
modified version of Brophy and Good's Teacher-Child Dyadic Interaction 
System (1970) and Dweck's observational procedures (Dweck et al., 1978). 
Instances of teachers' use of praise and criticism were observed and re-
corded as they related to each student's quality of work, form of work,
and conduct. The observers also coded teachers' explicit use of causal
attributional statements into one of the four categories of task diffi- 
culty, effort, ability, or incorrect use of a mathematical operation. 
Finally, each explicit expectancy statement made by the teacher with 
respect to a child's performance was coded on a four-point scale ranging 
from most positive to least positive. The mean percentage of agreement 
for each observer with criterion coders was greater than 76% in all 
cases, with over 70% agreement being attained for individual categories. 
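
The percentage-of-agreement index reported above can be illustrated with a short Python sketch. The coded events and the function name are hypothetical and do not come from the study; the sketch assumes agreement is counted event by event against a criterion coder.

```python
def percent_agreement(observer_codes, criterion_codes):
    """Percentage of events coded identically by observer and criterion coder."""
    if len(observer_codes) != len(criterion_codes):
        raise ValueError("Both coders must code the same events.")
    matches = sum(o == c for o, c in zip(observer_codes, criterion_codes))
    return 100.0 * matches / len(observer_codes)

# Hypothetical codes for four classroom events (illustration only).
observer = ["praise-quality", "criticism-conduct", "praise-form", "praise-quality"]
criterion = ["praise-quality", "criticism-conduct", "criticism-form", "praise-quality"]
print(percent_agreement(observer, criterion))  # 75.0
```

Simple percentage agreement does not correct for agreement expected by chance, which is worth keeping in mind when reading the 70% and 76% figures.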

The student questionnaire consisted of six questions about students' 
expectancies for success in mathematics, the questions being divided 
into those examining expectancy for success on a familiar task (i.e., 
current task), and those examining expectancy for success on a less- 
familiar task (i.e., future task). A seven-point rating scale was used, 
ranging from "not at all well" to "very well". Cronbach's alpha coeffi-
cients ranged from 0.77 to 0.85 and the correlation between the two
scales was 0.62. 

The teacher questionnaire contained two items. Teachers were asked 
to rank each student's position in class in terms of quintiles and also 
to indicate on a seven-point scale, ranging from "very poorly" to "very
well", the expectancy for each student's performance in a future advanced
mathematics course. 

The observational system was used in eight seventh-grade and seven 
ninth-grade classrooms in middle to upper-middle class neighborhoods in 
a small northwestern city. Classes were volunteered by their teacher. 
The mathematics curriculum in each class was at grade level or slightly 
advanced. Observations were conducted for 13-15 hours in each class- 
room over a two-month period, with the last 10 hours of observations 
being recorded.

The student questionnaire was administered in 12 of the above class- 
rooms to students who volunteered and also received parental permission. 
Fifty-nine percent of the total seventh-grade sample and 67% of the 
ninth-grade sample participated. Teacher questionnaires were apparently 
administered to all participating teachers. The questionnaires were
administered after the completion of the observations. 
Four sets of analyses were planned: 

1. comparisons of the teachers' use of discriminant praise and
criticism for boys versus girls;

2. comparisons of the teachers' use of praise and criticism for
students having high versus low teacher expectancies;

3. comparisons of the teachers' causal attributions and expectancy
statements for boys versus girls; and

4. comparisons of boys' and girls' expectations for their own
success.

4. Findings 

1. Neither of the two main effects, sex or grade, nor the inter- 
action between sex and grade was significant for any of the five vari- 
ables: percentage of praise directed to the quality of work and the 
form of work; percentage of criticism directed to the quality of work, 
to the form of work, and to conduct. Praise directed to conduct was 
deleted from the analysis since it occurred very infrequently. The 
same conclusion was reached when the classroom, treated as a random 
factor nested within grade, was used as the unit of analysis, when the 
individual was used as the unit of analysis, and when sex and teacher 
were used as independent variables. In each case analysis of variance 
procedures were used. 

2. Students were divided into high and low expectancy groups, 
based on the teacher's expectancies. With respect to praise there were
no significant differences between grades or between expectancy groups, 
nor was there a significant sex-by-expectancy-group interaction. It 
was implied, although not stated, that there was no significant differ- 
ence between sexes. For criticism, grade level was not significant nor 
was the sex-by-expectancy-group interaction. It was implied, though not 
stated, that there was no difference between expectancy groups. For 
criticism, boys received significantly more criticism in dyadic situa- 
tions than did girls. The results for other interactions were not re- 
ported. For the above, the classroom was the unit of analysis and the 
dependent variable was the mean score for each sex within each expectancy
group and classroom. 

3. Teachers made very few attributional or expectancy statements. 
Most attributional statements followed unsuccessful student outcomes and 
only these were coded. Chi-square analyses revealed that teachers' use
of attributional and expectancy statements did not vary as a function of 
either sex or teacher expectancy. 

4. With respect to students' expectations for their own success 
the only significant difference reported was that girls had lower ex-
pectancies of success for future tasks than boys. Neither sex, grade
level, nor the interaction between sex and grade level was significant
for the complete expectancy of success scale or for the expectancy of 
success for the current performance subscale. ANOVAs were carried out 
using both the classroom and individual as the unit of analysis with 
similar results. 

5. Interpretations

The investigators concluded that these findings did not support
those found previously by Dweck et al. They suggested that one possible
reason for the discrepancy was that Dweck et al. had used only three
teachers who might not have been representative. Also, the current
study was conducted in junior high school mathematics classes whereas 
Dweck et al. observed fourth and fifth graders in a variety of subject 
areas. It was suggested that "teachers' feedback is in part determined
by the age of the students" (p. 1019). It was also suggested that "sex
differences in expectancies for mathematics do not emerge with any con-
sistent regularity until late junior high school" and hence the findings
in the present study were not surprising. Future research was recom- 
mended at a variety of grade levels to attempt to resolve reasons for 
the conflicting results. 

The finding that girls had lower expectancies for future or unfa- 
miliar tasks than did boys was considered to be in support of previous 
research. It was noted that a study examining the relationship between 
expectancies for success and participation in advanced mathematics was 
being undertaken. 

Abstractor's Comments

Research which helps educators better understand why fewer girls 
than boys choose to study mathematics in the high school can be valuable 
in alleviating this problem. Although the results of this study suggest 
that teachers' use of evaluative feedback in junior high school mathe- 
matics classes may not be a reason for underparticipation by females, as 
the investigators point out, this finding in itself is valuable. 

Although the research was quite well designed and reported, several 
questions must be asked about the study: 

1. One of the criteria for a class to be included in the sample 
was voluntary agreement by the teacher to participate. Are such classes 
representative of the larger population? 

2. The student questionnaire was administered only to students in 
12 of the 15 classes in the sample who volunteered and who had parental 
permission. Were these students representative of the larger population? 
From data reported in the study the participation rate of girls on the 
student questionnaire was 12% higher than for boys in grade 7 and 16%
higher in grade 9. What are the implications of such a different rate
of participation?

3. Details of the student questionnaire are scanty. Only six items
were used in the current analysis, these consisting of two subscales.
It is not stated how many items were on each subscale. One subscale is
referred to as having novel or less familiar items at one time, and
later as having items referring to success in later mathematics courses. 
These concepts need more explanation before valid judgments can be made 
about this instrument. 

4. In the primary analysis the classroom was used as the unit of 
analysis. The researchers are to be commended for using this unit of 
analysis; however, they should have been content with this decision and
not repeated the analysis using students as the unit of analysis.
What would have been their conclusions if the second analysis had indi- 
cated significant differences, when clearly the proper analysis revealed 
no such differences? The motives of the researchers become suspect when 
such inappropriate methods are used. 
5. There was some indication in the rationale for the study that
the investigators may have expected to find differences in teachers' use
of evaluative feedback with girls and with boys. Yet when they found no 
such differences, they suggested in their discussions that this was not 
surprising since differences tend not to emerge until late junior high. 
One wonders why, if this was the case, they chose seventh graders for 
their sample rather than eighth or even tenth graders. Perhaps a similar 
study should be undertaken with students in these grade levels. 

In spite of these criticisms, the study is a worthwhile contribution 
to the literature in this area. As the investigators indicate, more 
research is necessary before we can be sure if and where differences 
exist in teachers' use of evaluative feedback.
Hiebert, James. THE POSITION OF THE UNKNOWN SET AND CHILDREN'S SOLUTION
OF VERBAL ARITHMETIC PROBLEMS. Journal for Research in Mathematics Education
13: 341-349; November 1982.



Abstract and comments prepared for I.M.E. by J. DALE BURNETT, Queen's 
University, Kingston, Ontario. 

1. Purpose 

Addition and subtraction problems were verbally presented to first- 
grade students in order to examine the relationships between "position of
the unknown set" and the child's (1) method of representation and (2)
strategy for obtaining a solution. 

2. Rationale 

The article cites two studies from 1972 and 1973 which looked at
students' solutions to number sentences and concluded that the position
of the unknown in a number sentence affected the level of difficulty of
the problem. One other study, a 1981 article which was co-authored by
Professor Hiebert, is cited which indicates that, given the opportunity,
many young students will represent such problems with small cubes and then
manipulate the cubes to arrive at the correct answer.

3. Research Design and Procedures 

Sample:   3 first-grade classrooms, n = 47. All students receiving
          parental permission were included.
Setting:  March. The students had not received any previous formal
          instruction in solving verbal problems or in using concrete
          objects to represent problem situations.
Task:     An interviewer read 6 problems to each student: 3 involving
          joining (addition) and 3 involving separating (subtraction).
          The order of presentation was randomized for each student.
          A set of small cubes was available. Factors such as syntactic
          complexity and number size were similar across all of the
          problems. Examples of the verbal problems and the associated
          number sentence are provided in the article. Thus, a joining
          problem where the position of the unknown set in the associated
          number sentence is that of the second addend is: "Bill had
          3 marbles. Susan gave him some more marbles. Now he has 8
          marbles altogether. How many marbles did Susan give to Bill?",
          which is paired, at least in the investigator's mind, with
          the number sentence 3 + □ = 8.
Summary:  A total of 47 first-grade students were each given 6 verbal
          arithmetic problems, resulting in a total of 282 protocols.
Analysis: The analysis consisted of three phases. First, each protocol
          was examined and identified as exemplifying a particular strategy.
          From this stage a total of 9 "appropriate" strategies and 3
          "inappropriate" strategies were identified.

          These 12 strategies were then cross-classified with the 6
          problem types and with whether or not the student used cubes to
          model the situation to provide a 6 x 2 x 12 table of student
          responses. Essentially the table consists of 282 classified
          protocols fitted into a structure with 144 cells.

          Two additional columns were added to the table, one containing
          the simple sum of all of the "appropriate" strategies for that
          problem type and a second which indicated how many of these
          strategies resulted in the correct answer.

          The final phase of the analysis consisted of examining this
          table and computing a few sub-totals and their corresponding
          percentages.
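
The arithmetic behind the table described above can be sketched in Python. This is an illustration only: the numeric labels for problem types and strategies, the three example protocols, and the function name are hypothetical, and nothing here reproduces the article's actual counts.

```python
from collections import Counter
from itertools import product

N_STUDENTS = 47
N_PROBLEM_TYPES = 6
N_STRATEGIES = 12  # 9 "appropriate" plus 3 "inappropriate"

problem_types = range(1, N_PROBLEM_TYPES + 1)
used_cubes = (True, False)
strategies = range(1, N_STRATEGIES + 1)

# Every cell of the 6 x 2 x 12 cross-classification.
cells = list(product(problem_types, used_cubes, strategies))
n_protocols = N_STUDENTS * N_PROBLEM_TYPES

def tabulate(protocols):
    """Count protocols in each (problem type, used cubes, strategy) cell."""
    counts = Counter(protocols)
    return {cell: counts.get(cell, 0) for cell in cells}

# Three made-up protocols, for illustration only.
example_protocols = [(1, True, 2), (1, True, 2), (4, False, 11)]
table = tabulate(example_protocols)
print(len(cells), n_protocols)  # 144 282
```

With 282 protocols spread over 144 cells, the average is fewer than two protocols per cell, so many cells must be empty or nearly so; this is why the analysis relies on sub-totals and percentages rather than on individual cells.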

4. Findings 

This study addressed two principal issues. With respect to the method
of representation used to model the problem (i.e., whether or not the student
used cubes), the author notes that 55% of the responses to problems of the
form a ± b = □ involved cubes as part of the overall strategy, as compared
with 40% for problems of the form a ± □ = c and only 18% for problems
like □ ± b = c.
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The second issue, dealing with the cognitive strategies used by the 
students to solve the problems, noted dominant strategies for problems of 
the form a ± b - Q and a - □ ■ c, but a variety of approaches for 
a + Q * c and D ± * c types, 

A review of the tabulations of the correct answers shows that most
students (88% or 71%, depending on whether they used cubes or not) could
solve problems of the form a ± b = □, whereas 50% or 33% could solve
□ + b = c and 22% or 37% could solve □ - b = c. The author also makes
special note that only 39% and 28% could solve a + □ = c whereas 80% and
37% could solve a - □ = c types. There was a higher percentage of success
in five of the six problem types for the group using cubes than for the
group not using cubes.

The findings just noted are discussed on two levels. At a level closely
related to the empirical context, the following conclusions are noted:

1. "... the position of the unknown had a substantial effect on
children's modeling behavior" (p. 345) (i.e., cubes or no cubes).
This is based on the differential percentages (53%, 40%, 18%)
of students using cubes across the three main problem types.

2. "... the strategies used to solve the problems matched the action
or relationships described in the problem" (p. 345). This is based
on a careful comparison of strategy types for each problem type.

3. "... problems with the unknown in the first position not only are
more difficult to model but also are more difficult to solve"
(p. 345). This is based in part on the data used to support the
first conclusion and in part on the lower level of success for the
"first position" problems.

4. The major finding of the study is nicely summarized by the statement,
"... the relative difficulty of a problem in this study seemed to
depend on whether or not it was initially modeled with objects,
which in turn depended on the position of the unknown set" (p. 347).

5. Interpretation

The second level of interpretation revolves around a brief discussion 
of two theoretical models, one by Skemp and one by Riley, Greeno, and 
Heller, which suggest "that arithmetic problems that are amenable to 
direct representation may be easier to solve than those that are not" 
(p. 348). "The results of this study indicate that the position of the 
unknown set in a verbal problem determines to a substantial degree 
whether or not the problem can be modeled successfully by first-grade 
children" (p. 348). 

Abstractor's Comments 

I would like to organize my comments into three groups. First,
I would like to summarize those aspects of the study that I liked.
Second, I would like to play around with the data a little to show
why I have some reservations about the conclusions. Finally, I would
like to append a brief introspection of the review process itself.

There is much about this study that appeals to me. The main 
emphasis is one I endorse: it focuses on the actual processes used 
by children while attempting to solve a particular class of problems. 
Krutetskii (1976) has expressed his preference for this type of research 
strategy both forcefully and eloquently: 

It is hard to understand how theory or practice can be enriched 
by ... who computed, for 130 mathematically gifted adolescents, 
their scores on different kinds of tests and studied the 
correlation between them, finding that in some cases it 
was significant and in others not. The process of solution 
did not interest the investigator. But what rich material 
could be provided by a study of the process of mathematical 
thinking in 130 mathematically able adolescents! (p. 14). 

Given the concern for process, the study is well-designed and carried 
out. I must also admit that I have a soft spot for researchers who do 
their best to let the data speak to them, and to help them improve their 
understanding, rather than simply to fit the data to confirm (or refute) 
a well-defined hypothesis. Perhaps my preference is based on my belief
that the former group is faced with a more difficult task, less structured
and more open to methodological criticism, but more interesting and,
in the long run, I believe more fruitful. Parenthetically, I view
both approaches as lying on a continuum, the exploratory mode eventually
giving rise to the confirmatory mode. My suspicion is that we too often
believe ourselves to be in a confirmatory situation when in fact an
exploratory attitude would be more appropriate.

Given my stated preference for process-oriented studies, I find this
report quite disappointing in one major sense. The report fails to
give the reader any real understanding of the processes used by the 
students. All we are given is a generalized description of each strategy 
and a frequency table that indicates how many students used it in a 
particular situation. I suspect (perhaps wrongly) that there is a 
richness in the protocols that has not yet been captured. For example, 
three types of information are not reported: 1) the student's use of
language; 2) timing information (where the pauses and the quick
bursts of activity fall in the efforts to solve the problem); and 3) the
student's written work (if any). I suspect that the topic of protocol
analysis itself is likely to undergo substantial development in the next 
decade. Studies like the present one give us an opportunity to begin 
this development. I would like to encourage the author to take one more 
step along this path. 

Now for a comment on the table of data that is presented in the 
article. As noted in the abstract, the table is essentially a compilation 
of 282 events into a tabular structure with 144 cells. Because many 
strategies are used more than once in a particular context, the resulting 
table contains 62 non-empty cells and 82 empty cells. The two
non-strategies (uncodable and indeterminate) account for 16 cells and 61
events (22% of the data). Simply stated, we do not have a lot of data
here upon which to base conclusions. Rather than examine the complete
table for noteworthy features, I created a number of sub-tables, a few
of which prompt further comment.
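
The sparseness of the master table can be checked directly from the counts quoted above. The following is a minimal sketch in Python that uses only figures stated in the abstract and comments; the variable names are illustrative and not taken from the study.

```python
# Figures quoted in the abstract and comments; nothing else is assumed.
students = 47
problems_per_student = 6
problem_types, cube_conditions, strategies = 6, 2, 12

protocols = students * problems_per_student            # classified responses
cells = problem_types * cube_conditions * strategies   # cells in the master table

non_empty_cells, empty_cells = 62, 82
non_strategy_events = 61   # "uncodable" plus "indeterminate" responses

mean_per_cell = protocols / cells
mean_per_non_empty_cell = protocols / non_empty_cells
non_strategy_share = non_strategy_events / protocols

print(protocols, cells)                                 # 282 144
print(round(mean_per_cell, 2), round(mean_per_non_empty_cell, 2))
print(f"{non_strategy_share:.0%}")
```

On average there are fewer than two protocols per cell, and fewer than five per non-empty cell, which is the sense in which there is not a lot of data behind any single entry.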






For example, restricting attention for the moment to the strategy 
labeled "uncodable" (for me, an interesting situation deserving of 
further research), it is an easy matter to construct two 2x3 tables 
(addition or subtraction by position of unknown), one for students who 
used cubes and the other for those who did not. 

Used Cubes                         Did Not Use Cubes

         Position of Unknown                Position of Unknown
           1     2     3                      1     2     3

Addn.      0     0     1           Addn.      6     4     1

Subtr.     0     0     1           Subtr.     2     4     2

Clearly, most uncodable strategies were used by those students
who did not model the problem using cubes (and, reasonably, gave the
researchers very little hint as to what they were thinking).

Errors are also interesting, and often informative, when investigating 
children's mathematical behavior. Three error strategies were identified 
in this study. Easily the most common error, and in fact the most common 
strategy found in the study, was that of saying that the answer is one 
of the two numbers given in the statement of the problem. Composing two 
tables as before for this strategy yields: 

Used Cubes                         Did Not Use Cubes

         Position of Unknown                Position of Unknown
           1     2     3                      1     2     3

Addn.      1     6     0           Addn.     16    15     4

Subtr.     3     1     0           Subtr.    11    10     4

Three comments: 1) a total of 71 responses out of 282, or 25% 
of the sample of responses, were of this type; 2) the number of errors 
of this type increased as the position of the unknown moved from right 
to left; and 3) this type of error is much more prevalent among students 
who did not use the cubes than among those who did. A similar pattern, 
but with half the strength, is observed if you construct two tables for
the "indeterminate" error strategy.

In summary, a review of the appropriate sub-tables for the various
error and unknown strategies indicates 1) that students who use cubes to
model the problem have fewer difficulties than those who fail to model the
problems; and 2) the errors are much more frequent for problems of the type
□ ± b = c and a ± □ = c than for a ± b = □.
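
The totals quoted for the "answer is one of the given numbers" error can be recomputed from the two sub-tables. This is a sketch only: the rows for students who used cubes are read here as 1, 6, 0 and 3, 1, 0, which is an assumption about a poorly reproduced table, adopted because it is the reading that yields the stated total of 71.

```python
# Counts of the "repeats a given number" error by position of the unknown (1, 2, 3).
# The used_cubes rows are an ASSUMED reading of a garbled table; they are the
# reading that reproduces the total of 71 responses quoted in the comments.
used_cubes = {"addition": [1, 6, 0], "subtraction": [3, 1, 0]}
no_cubes = {"addition": [16, 15, 4], "subtraction": [11, 10, 4]}
TOTAL_PROTOCOLS = 282


def total(table):
    """Sum every cell of a sub-table."""
    return sum(sum(row) for row in table.values())


def by_position(table):
    """Column sums: errors at each position of the unknown."""
    return [sum(col) for col in zip(*table.values())]


grand_total = total(used_cubes) + total(no_cubes)
share = grand_total / TOTAL_PROTOCOLS
position_totals = [
    a + b for a, b in zip(by_position(used_cubes), by_position(no_cubes))
]

print(total(used_cubes), total(no_cubes), grand_total)   # 11 60 71
print(f"{share:.0%}")                                     # 25%
print(position_totals)                                    # [31, 32, 8]
```

Under this reading, 60 of the 71 errors come from students who did not use cubes, and only 8 occur when the unknown is in the third position.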

The latter conclusion prompts two additional comments: 1) I somehow 
doubt that any classroom teacher would be surprised by this finding; and 
2) the ordering of difficulty of the problems also corresponds to the 
linguistic complexity of the problems. Thus, problems of the form a ± b = □
require two factual statements followed by the question; forms a ± □ = c
and □ ± b = c all require three factual statements for the student to keep
track of before the question is presented. 

Three of the successful strategies are strongly identified with one 
specific problem type and with the fact that they all used cubes to model 
the situation. Thus, "counting all" is used almost always for problems of
the type a + b = □; "separate" ["the larger quantity is represented, and
the smaller quantity is removed from it. The remaining objects are counted
to find the answer" (p. 344)]; and "separate to" ["the larger quantity is
represented, and objects are removed until the smaller quantity remains.
The removed objects are counted to find the answer" (p. 344)]. The other
common successful strategy was that "based on recall of that particular
number fact" which was used by those students who did not use cubes. And
now a concern emerges. Most of the strategies used in this study essentially
require (or fail to require) that cubes be used (or not) as an integral part
of their very definition (i.e., if you use the "counting all" strategy you
are virtually required to use cubes. Similarly, if you know the required
number facts, why would you use the cubes?). The remaining components of the
master table are sufficiently sparsely populated as to question the value of 
including them in the overall analysis. 

It is now instructive to compare the conclusions emanating from the 
above treatment with those of the author. It is worth emphasizing that the 
data base is the same — frequencies of strategies under particular conditions; 
the difference is in the selective focus on specific subsets of the data.

The author claims that "... the relative difficulty of a problem
seemed to depend on whether or not it was initially modelled with objects, 
which in turn depended on the position of the unknown set" (p. 347). 
The data support this conclusion. They also support other interpretations. 
For example, whether or not a problem was initially modelled with objects 
seems to depend on whether or not the student knew the relevant number
facts, or whether or not the problem is actually understood (i.e., would 
a student who really understood the problem ever use the strategy of 
"repeating a given number"?). The relative difficulty of the problem 
is also dependent on the position of the unknown set. Although the author 
did not make this mistake, it still deserves emphasizing that the nature 
of the data is descriptive and correlative — it is possible to make 
statements of a relational nature (e.g., as the position of the unknown 
moves from right to left, the problems become more difficult), but it is 
not appropriate to make statements of a causative nature. We still do 
not know why this particular effect is observed. 

One final thought: Is the idea of "position of unknown" an adult 
surface structure feature masking a deeper semantic level structure for 
the child? It is easy enough to compare the two algebraic representations 
a + b = □ and a + □ = c and see them as simply differing in the location
of the unknown. However, these representations are not available to
first-grade students. Their task is to make sense of a string of short
verbally presented sentences requiring them to use both memory and logic
(who has what and what is unknown). These problems are verbally, semantically,
and logically more complex when the unknown is on the left side of the 
algebraic equation. 

As an aside, I would like to thank Professor Hiebert for conducting 
the original study and Marilyn Suydam, the editor of IME, for inviting 
me to prepare this abstract and comment. The exercise has been a personal 
joy. I rarely read articles with the precision that this task required 
(a sad admission). Writing a succinct abstract of a research study is 
useful in any context. I must do more of it (independent of IME!). 
Most of the comments were a spontaneous outgrowth of preparing the abstract,
a domino effect. How much more rewarding this has been than my more
typical skim reading which would simply note that some kids use cubes 
and some do not when solving problems of this type and that the 
position of the unknown affects the level of difficulty of the problem. 

Reference

Janvier, Claude. USE OF SITUATIONS IN MATHEMATICS EDUCATION. Educational
Studies in Mathematics 12: 113-122; February 1981.



Abstract and comments prepared for I.M.E. by WILLIAM E. GEESLIN, 
University of New Hampshire. 

1. Purpose

The author examines the effect of a situation on children's cognitive 
behavior. Situation refers to the physical and verbal environment (context) 
in which a question or problem is posed. 

2. Rationale 

Educators encourage relevant or meaningful problem situations. Like- 
wise, many psychologists feel the context in which instruction or problems 
are presented is a determinant of children's mental structures. Consequently, 
a wide range of manipulatives, physical materials, and "realistic" problems 
have been developed for classroom use. Interest in this area has come from 
the writings of Piaget, Bruner, Dienes, Fischbein, and others. 

3. Research Design and Procedures

The sample apparently consisted of 40 first-year pupils in an English
secondary-level school. However, an additional 20 pupils from the first-
(n = 7), second- (n = 7), and fourth-year (n = 6) secondary level were also
interviewed. Subjects were presented with a graph which gave the speed of
a race car during the second lap around a race track. Students were asked 
about the number of bends in the race track and later asked to select the 
corresponding track from seven alternatives. No information was given con- 
cerning selection of the sample. The author stated that the 40 first-year 
students received the task in written form, but all discussion seems to 
refer to interview data. No statistical analysis of data was presented. 

4. Findings 

The main difficulty students had in determining the number of bends in 
the track was confusing the speed graph with the track. In selecting the 
track that matched the speed graph, many students were unable to get relative
perceptions of the track bends. Sex differences were noticeable on several
aspects of the task.

5. Interpretations

Familiarity with racing cars helped some boys to complete the first part
of the task. Girls, on the other hand, could count on little "situational"
support. However, in selecting the correct track, familiarity with the sit- 
uation appeared to inhibit the necessary abstraction process. Mental images 
conflicted with the basic abstract aspects of the problem. Too much infor- 
mation, i.e., knowledge of racing cars, made the task more difficult. 

Wide individual differences were noted. Situations should be used pri- 
marily to assist students in developing their ability to abstract. The 
author recommended the use of large-scale situations (i.e., involvement over 
a long time period) that stress the child's point of view rather than
mathematical structure. Students need "verbal tags" if they are to deal
successfully with abstract concepts. Most importantly, the use of situations does
not necessarily make learning easier and is not the panacea for transforming 
abstract ideas into "concrete" representations. 

Abstractor's Comments



The major contribution of this article is the idea that educators may 
make mathematics more difficult in their efforts to help children learn. 
Introduction of physical materials, manipulatives, motivating situations, 
and realistic problems may confuse the student rather than clarify the mathe- 
matics. Research that attempts to investigate accepted "truths" is impor- 
tant and often quite revealing. 

Unfortunately, this study is reported poorly and thus does little to 
answer the questions raised. The results are confounded with sex, spatial 
ability, knowledge of graphs, and type of problem presentation. The author 
chooses to disregard some results (e.g., sex differences) while emphasizing 
others (e.g., errors on the task) and does not explain his choices. In fact, 
little data are presented, leaving the reader unable to judge the validity 
of the many inferences made. It was stated that the task was administered 
in written form to first-year students, yet the discussion of methodology 
and results indicated an interview technique was used. Discussion includes
results from non-first-year students even though the author earlier stated
that only results of the written (and thus first-year students') task
presentation would be discussed. Therefore neither sample, data, nor procedures
are presented clearly. Editors and referees should not allow this much
confusion in reporting and should assist the author in locating points of
confusion.

Note that the above criticisms are not criticisms of interview method- 
ology (or case studies), but rather are criticisms of the reporting. Inter- 
views are a valuable methodological tool. However, they require controls, 
planning, and detailed reporting just as the traditional large-scale statis- 
tical studies do. 

It is my hope that the ideas of Janvier will not be disregarded. The 
reader will not find much solace in the article as is. However, many inter- 
esting questions arise: Was the task used by Janvier a "mathematical" one? 
Is spatial ability related to abstracting ideas from a graph? Does the use 
of even a "good" manipulative or problem context confuse the learner or
change the task for some individuals? How does one select appropriate mental
processes/retrieve appropriate information when faced with a task? Does the 
use of "large situations" (such as USMES problems) promote the learning of 
either problem solving or mathematical concepts? Some of these questions 
clearly require numerous investigations if we are to obtain answers to them. 
Janvier's article provides us with some hints as to how we might pursue these 
questions.

Levine, Deborah R. STRATEGY USE AND ESTIMATION ABILITY OF COLLEGE
STUDENTS. Journal for Research in Mathematics Education 13: 350-359;
November 1982.



Abstract and comments prepared for I.M.E. by OTTO BASSLER, George
Peabody College for Teachers of Vanderbilt University.

1. Purpose

The study investigated the question, "Among college students, what
are the relations among computational estimation ability, the number
and types of estimation strategies used, and quantitative ability?"
(p. 351).

2. Rationale

Estimation skills are useful in daily living and have been recommended 
as necessary basic skills by the National Council of Teachers of 
Mathematics and the National Council of Supervisors of Mathematics. 
Little research has focused on adult estimation skills or the strategies 
that adults use to estimate. 

3. Research Design and Procedures 

The sample consisted of 89 college students who volunteered to 
participate. Descriptive information about the subjects indicated 
sex (34 men, 55 women) and previous mathematics courses (a mean of 3.0 
years of high school mathematics; 53% had not completed a college- 
credit mathematics course; no mathematics majors were included). 

Instruments used in the study were (a) Test of Estimation Ability 
(TEA) and (b) School and College Ability Test (SCAT) quantitative subtest. 
The TEA is an investigator-constructed test consisting of ten multiplica- 
tion and ten division exercises. Items contained two whole numbers, a 
whole number and a decimal fraction, or two decimal fractions. Directions 
to subjects were to "think aloud" to obtain an oral estimate to the 
solution of the exercise. This provides a score on each item determined 
by the accuracy of the estimate as well as a determination of the 
strategy used to obtain the estimate. Reliability of scores was .80
and percent of agreement by two raters on strategies was 90.4. The
SCAT quantitative subtest, used to measure ability, had a reported
reliability of .89.

Eight strategy classifications were developed based upon pilot data, 
estimation literature, and a logical analysis of the test exercises. 
Classifications were: 

1. Fractions (F): use of common fractional relationships.

2. Exponents (Exp): a form of scientific notation.

3. Rounding Both Numbers (R2): both numbers estimated by a
multiple of a power of 10.

4. Rounding One Number (R1): only one number is rounded.

5. Powers of Ten (Pow): rounding to a power of 10.

6. Known Numbers (K): rounding to numbers having a known product
or quotient.

7. Incomplete Partial Products (Quotients) (IP).

8. Proceeding Algorithmically (Alg).

Examples for each classification were provided. 

Testing time for each subject was approximately one hour. First 
the TEA was administered by presenting the items in random order. 
Subjects thought aloud as they obtained estimates without using pencil 
or paper. To clarify the strategy that was used, the investigator 
asked probing questions. This portion of the testing session was tape- 
recorded for later scoring. Next, subjects were asked to provide a 
brief educational background by completing a short questionnaire. 
The SCAT was completed last and took 20 minutes. 

4. Findings 

1) The correlation coefficient between scores on the SCAT (quantitative 
ability) and TEA (estimation ability) was .74. 

2) Analysis of variance followed by the Scheffé procedure was conducted
to test for differences in the frequencies of use of the eight strategy
types. The results indicated Exp, IP, Pow, and K were used least
frequently; F and R1 were used more frequently; and R2 and Alg were used
most frequently.

3) Analysis of covariance, where the covariate was quantitative
ability, indicated no systematic relationship between accuracy of 
estimate and strategy used when scores on individual test items were 
analyzed. 

4) A one-way ANOVA was used to compare quantitative ability of 
students using different strategies for each item. Results were 
significant for 12 of the 20 items and indicated consistently that 
students using Alg were of lower ability than students using one of the 
types R1, F, K, or Pow.

5) The correlation coefficient between number of strategies used 
and scores on the SCAT was .55. 

6) No significant relationship was found between number of estimation 
strategies and score on the TEA when quantitative ability was partialled 
out. 

7) The mean score on the TEA, which has a maximum score of 60, 
was 25.9. 

5. Interpretations 

1) Scores on the TEA were generally low, which suggests that 
estimation is difficult for college students. 

2) Estimation ability is closely related to quantitative ability. 
In general, high-ability students are better estimators and use more 
strategies than low-ability students. Also, lower-ability students tend 
to use the strategy Alg. 

3) The most frequently used strategy types were R2 and Alg. Use of 
Alg may stem from a dependence on exact paper-and-pencil calculations, 
whereas R2 may be the technique most often taught as a method of 
estimation. 

4) Student estimates when quantitative ability was statistically 
controlled seemed to have similar accuracy when different strategies 
were used.

Abstractor's Comments

This study describes and analyzes the estimation strategies of a 
particular group of college students. Whether these results can be 
generalized to a broader population is debatable — especially since 
all students were volunteers from a single New York City college and 
no descriptive statistics pertaining to ability as measured by the 
SCAT were provided. Another factor which might influence the results 
is the low level of achievement on the Test of Estimation Ability. 

One surprising finding was that, when ability was controlled, there 
was no relation between estimation strategy used and accuracy of 
estimate. This may be due to the particular items on the test, or perhaps 
to poor application of the strategy. In any case, the strategies do 
produce different accuracies. For example, for the test item 824 x 26,
using F (824 x 1/4 x 100) yields an estimate 4% too small, and a test
score of 3; whereas, using R2 (800 x 30) yields an estimate 12% too
large and a test score of 2. Other methods of estimation may yield
more diverse results. This finding needs further investigation.
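
The accuracy of the two estimates for 824 x 26 can be worked out directly. The short Python sketch below computes the signed percentage error of each estimate relative to the exact product; the function and variable names are illustrative, and the TEA scoring rule itself is not reproduced because the abstract does not state it.

```python
def percent_error(estimate, exact):
    """Signed error of an estimate, as a percentage of the exact value."""
    return 100.0 * (estimate - exact) / exact


exact = 824 * 26                      # exact product: 21424
fraction_estimate = 824 * 100 / 4     # F: treat 26 as about 1/4 of 100
rounding_estimate = 800 * 30          # R2: round both numbers

f_error = percent_error(fraction_estimate, exact)
r2_error = percent_error(rounding_estimate, exact)

print(f"F:  {fraction_estimate:.0f} ({f_error:+.1f}%)")   # F:  20600 (-3.8%)
print(f"R2: {rounding_estimate} ({r2_error:+.1f}%)")      # R2: 24000 (+12.0%)
```

A negative sign means the estimate falls below the exact product and a positive sign means it falls above, so the two strategies differ in the direction of their error as well as in its size.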

The study did provide a useful way of assessing estimation abilities 
and strategies. It provides support for the view that college students 
are poor estimators. Why would we expect different results when "they 
(the students) reported having been taught little if anything about 
estimating?" (p. 357). It is interesting to note, however, that 
despite this lack of instruction, individual students used an average 
of over four estimation strategies in completing the TEA. It seems 
to me that this study should provide the impetus for additional research 
and emphasis on teaching estimation strategies.

Szetela, Walter. STORY PROBLEM SOLVING IN ELEMENTARY SCHOOL MATHEMATICS:
WHAT DIFFERENCES DO CALCULATORS MAKE? Journal for Research in
Mathematics Education 13: 381-389; November 1982.

Abstract and comments prepared for I.M.E. by STANLEY H. ERLWANGER, 
Concordia University, Montreal, Quebec. 

1. Purpose

Main Study: To determine if students who use calculators in story 
problems tend to try more problems, to use more correct operations, and 
to obtain more correct answers than students who use paper and pencil 
only. Supplementary Study: To compare the use of calculators on a post- 
test of problem-solving with the use of paper and pencil only, after all 
groups had used calculators during 8 weeks of instruction. 

2. Rationale

First, in many studies, students who used calculators during instruction
used only paper and pencil in achievement posttests. Roberts (1980)
has criticized this tendency and suggests calculators should be used 
instead. Second, Szetela (1980a, 1980b, 1981) has shown that the calcu- 
lator is a critical factor in solving story problems. In all three 
studies, students who used calculators performed better than those who 
did not. There were no significant differences when paper and pencil 
was used. Third, Wheatley and his colleagues have suggested that when 
calculators are used students can "focus on choosing the correct opera- 
tions, determining the reasonableness of their answers, and further, a 
broader range of strategies is possible" (p. 21). Fourth, although there 
are hypotheses concerning superior problem-solving performance when stu- 
dents use calculators, there is little evidence to support them. 

3. Research Design and Procedures 

The investigation consisted of a main study and a supplementary 
study conducted simultaneously. 

Subjects: Two classes in each of grade 3 (n = 50), grade 5 (n = 36),
grade 7 (n = 49), and grade 8 (n = 52) participated. Each grade was from
a different school. Students in grades 3, 5, and 7 were randomly assigned
to the Calculator group (C) or the Non-Calculator group (N). Grade 8
was partially randomized because of scheduling problems. One teacher in 
each grade taught both the C group and the N group. 

Instruments: One pretest and two posttests were given in each grade.
Pretest: A 40-item pretest on numerical skills developed by
Robitaille and colleagues (1979) was used for grades 3, 7, and 8.
A similar test was constructed for grade 5.

Posttests: In the first posttest, consisting of 16 items on computational
skills and 10 problems, only paper and pencil was allowed.
In the second posttest, consisting of 20 problems, the C group used
calculators.

Procedures: After the pretest, regular instructional activities were
followed by the N groups in grades 3, 5, and 7 for 8 weeks, and those 
in grade 8 for 12 weeks, since they started 4 weeks earlier. The topics 
for each grade were: 

Grade 3: Whole number operations in multiplication, basic division 
facts, and problem-solving applications. 

Grade 5: Introduction to decimals, operations with decimals, and 
problem-solving applications. 

Grade 7, 8: Decimals, ratios, percents, and problem-solving.

The C groups followed similar instruction at the same time, except 
that they used calculators and materials designed for calculators. One 
calculator was provided for every two students. In grade 3 the calcu- 
lator was used mainly for problem solving, while in grades 5, 7, and 8 
it was used for other activities as well as problem solving. 
Data Analysis: Posttest data were analyzed using analysis of covariance
with pretest scores as a covariate. Three measures were analyzed for 
the problem-solving tests on the number of problems (i) attempted, (ii) 
with all operations correct, and (iii) with correct answers. 

Supplementary Study 
Subjects: Seventy-six grade 7 students from three classes (two with the 
same teacher), 23 grade 6 students in one class, and 25 students in a 
split grade 5/6 class from two schools.
Procedures: Procedures were as in the main study, but calculators
were provided in each class, with one calculator for every two students.
Posttesting Treatment: Students were posttested on the 20 problems in
a Calculator Testing Mode (CTM) and in a Paper-and-Pencil Mode (PPM)
during one 50-minute session. For the first section of 10 problems, half 
of the students in each class were randomly assigned to CTM and the other 
half to PPM. For the second section of 10 problems, the groups reversed 
testing modes. 
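
The counterbalanced posttest just described can be sketched as follows. This is an illustration of the design as summarized here, not the procedure actually used in the study: the function name, the seed, and the class of 23 students are hypothetical.

```python
import random


def assign_crossover(student_ids, seed=None):
    """Randomly split a class in half and give each half an order of testing modes.

    Half the students take section 1 with calculators (CTM) and section 2 with
    paper and pencil (PPM); the other half take the modes in the reverse order.
    """
    rng = random.Random(seed)
    shuffled = list(student_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    schedule = {}
    for i, student in enumerate(shuffled):
        first, second = ("CTM", "PPM") if i < half else ("PPM", "CTM")
        schedule[student] = {"section_1": first, "section_2": second}
    return schedule


# Hypothetical class of 23 students, identified by number.
schedule = assign_crossover(range(1, 24), seed=0)
ctm_first = sum(1 for modes in schedule.values() if modes["section_1"] == "CTM")
print(ctm_first, len(schedule) - ctm_first)   # 11 12
```

Because every student is tested in both modes, each student serves as his or her own comparison, and reversing the order across the two halves balances any difference in difficulty between the two sections of 10 problems.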

Data Analysis: Data were analyzed by analysis of covariance with pretest
scores as a covariate. Measures were taken for the number of problems
(i) attempted, (ii) with all correct operations shown, and (iii) with
correct answers.

4. Results 

Main Study:

(a) The pretest results, with Hoyt estimates of reliability 
ranging from 0.81 to 0.90, show that for all the comparison groups no 
pretest means were significantly different. 

(b) First posttest (paper and pencil only - C and N groups) 
The means, standard deviations, and F Ratios were calculated 

for the four measures: Skills (160), Problem attempts (10), Correct 
operation (10), and Correct answers (10). The Hoyt estimates of reli- 
ability for the computation test ranged from 0.63 to 0.87. Only the 
grade 3 results were significant, in favor of the C group on three 
measures - Skills, Correct operation, and Correct answer. All other 
differences were nonsignificant. The author states: "Overall, it is 
evident that the use of calculators over periods from 8 to 12 weeks did 
not diminish skill in paper-and-pencil computation and problem-solving 
ability ..." (p. 384).

(c) Second Posttest (C group with calculators) 

The means, standard deviations, and F ratios were calculated
for three measures on the 20 problems: Problem attempts, Correct
operation, and Correct answer.

The results show significant differences in favor of the C 
groups as follows: 
Grade 3 (Correct answer); grade 7 (Problem attempts, Correct
operation, Correct answer); grade 8 (Correct answer). All other
differences were nonsignificant.

Supplementary Study:

(a) The results of the pretests, with Hoyt estimates of reli- 
ability ranging from 0.58 to 0.91, show that the pretest means for com- 
parison groups were not significantly different. 

(b) The posttest results show that the groups using calculators
performed significantly better than the paper-and-pencil groups on
four of the eight comparisons for Correct answers. The other four
differences were nonsignificant, except on Correct operations in grade 6.

5. Interpretations

The author points out three findings from the two studies: 

• The two studies showed a considerable degree of consistency in
problem-solving performance with and without calculators, and the
differences found were in favor of students using calculators.

• The results are consistent with other research that indicates that
no loss of paper-and-pencil skills occurs after using calculators
during instruction.

• The results indicate the advantage that calculators provide in
obtaining correct answers, but not in the number of problems
attempted or in the number of problems in which students choose the
correct operations.

The author observes further that, according to the evidence pre- 
sented by Zweng (1979) from the National Assessment of Educational 
Progress, it appears "that it is the inability of students to choose 
the correct operations rather than computational weaknesses that
contributes most to their inability to solve problems" (p. 387). The study
indicates that calculators helped students to compute correctly, but not 
to attempt more problems or to choose correct operations any better than 
students without calculators. 

The author concludes that although the study shows that the benefit
of using calculators with story problems is limited to avoiding
computational errors, this is an advantage - especially since calculators are
accessible and inexpensive.

Abstractor's Comments

The results from the study have implications for the teaching of 
mathematics. It is disappointing to find out that the main role of a 
calculator in solving problems is that of a computational aid. The study 
should be replicated. 

The research itself was well planned. The main study and supple- 
mentary study complement one another. The design and method of analysis 
were adequate aside from three limitations: one calculator for two stu- 
dents, the random assignment of grade 8 students, and two teachers for 
three grade 7 classes. My main concern is that the style of reporting 
is too concise and factual, so that it is difficult to see what assump- 
tions were made. The questions and comments below illustrate some of 
those aspects I feel would have enhanced the value of this research. 

1. The study attempted to find out whether the use of calculators makes
a difference in solving story problems on three measures: number
of problems attempted, correct operations, and correct answers. The
report should thus provide enough details about how these aspects
were taken into account.

2. Instruction: The instruction of the calculator and non-calculator
groups in each grade is an important aspect of the study. The
report says both groups received similar instruction, which appears
to consist of "problem-solving activities" and "other activities".
How were these two aspects organized and controlled to ensure that
both groups received the same treatment? Did students know how to
use calculators prior to the study? Moreover, since the study was
concerned primarily with problem solving, would it not be better
to confine the use of calculators to this aspect of the instruction
only?

3. Problem-Solving Activities: This is a critical aspect of the
whole study, but we know very little about it. The report does not
describe the approaches used by the teachers to teach problem
solving, and we do not know whether or not the problem-solving
activities were related to the purpose of the study. A more serious
issue is: What were the criteria for "use of paper and pencil" and
"use of calculator" in solving story problems, and how is this
related to the purpose of the study? In what way was the calculator
intended to make a difference? Did the use of calculators influence
what was done? It is not clear why one calculator was provided for
two students. Was this intentional?

4. The Posttests: The report gives a sample problem for each grade.
This is useful. However, the posttests were used to compare the
performance of the comparison groups on three measures. One would
therefore want to know the criteria used for selecting the problems,
constructing the tests, and determining each measure. Was the
selection of problems based on number of operations, complexity of
computation, problem structure, etc.? Was time a factor in the
test? What instructions were given to students?

I do appreciate the author's difficulty in that one is expected to 
produce a fairly short report for publication. However, a factual re- 
port seems to be inappropriate for a study which deals with aspects of 
problem-solving behavior. In my view, the value of this study would be 
greatly enhanced by detailed descriptions and explanations of assump- 
tions and procedures, which include a combination of quantitative and 
qualitative data. For example, a description of how teachers usually 
taught problem solving with and without calculators would have been 
useful. Similarly, a description of observations of how individual 
pupils in the comparison groups solved problems would help the reader 
to evaluate the results.

Threadgill-Sowder, Judith and Sowder, Larry. DRAWN VERSUS VERBAL
FORMATS FOR MATHEMATICAL STORY PROBLEMS. Journal for Research in 
Mathematics Education 13: 324-331; November 1982. 

Abstract and comments prepared for I.M.E. by MICHAEL T. BATTISTA, 
Kent State University. 

1. Purpose 

This study investigated the effect on problem-solving performance 
of two formats for presenting routine mathematical story problems: 
verbal (as typically encountered in textbooks) and drawn (line drawings
with minimal verbiage depicting a problem situation). 

2. Rationale 

Several studies have indicated that when students' performance on
story problems is compared, the picture or diagram format is superior 
to the verbal format. But these studies have been restricted to test 
situations only. The authors suggest that the results may have been 
due to the fact that the picture/diagram format was novel to the 
students. Thus, the present study included practice with both verbal 
and drawn format problems before testing. 

The study also investigated the effect of field independence and 
spatial visualization on performance in solving verbal and drawn format 
problems. In the case of field independence, the authors hypothesized 
that "The prominence of the essential information allowed by a drawing 
of a problem could negate any handicap a field dependent student might 
have with the problem in verbal format" (p. 325). Spatial visualiza- 
tion was included as a variable because it has been hypothesized in the 
literature to be related to visual encoding and flexibility in transforming 
data. 

3. Research Design and Procedures 

The subjects of the study were 262 students from ten participating 
fifth-grade classes in the Calgary, Alberta public school system. 
Field independence was measured by the Hidden Figures Test, spatial
visualization by the Punched Holes Test, and general reasoning by the 
Arithmetic Reasoning Test. All three measures were NLSMA adaptations 
of the French Kit of Cognitive Factors. A set of problems appropriate 
for fifth-grade students and requiring all four arithmetic operations 
was written by the authors in the usual verbal format. Each of these 
problems was also constructed in the drawn format. 

There were two equivalent forms of the 16-item posttest. Each form 
presented the problems in the same order, alternating between verbal 
and drawn formats. Problems in the verbal format on the first form 
appeared in drawn format on the second form, and vice versa. The
computational difficulty of the 8 verbal versus the 8 drawn problems on each
test was controlled by requiring that each problem in the verbal format 
be paired with a problem in drawn format that required similar computa- 
tional skill to solve. Each item on the posttest was scored for correct 
arithmetic operation and correct solution. Each student was assigned 
a Drawing Score, Verbal Score, and Total Score for the posttest. 
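
The two-form posttest layout and the three scores can be sketched as follows. Two points here are assumptions for illustration and are not stated in the abstract: that the first form begins with a verbal item, and that each item earns one point for the correct operation plus one point for the correct solution. The function names and the response pattern are hypothetical.

```python
def build_forms(n_items=16):
    """Return the item formats for two forms with the formats swapped."""
    # ASSUMPTION: Form A starts with a verbal item; the source only says
    # the formats alternate and are reversed between the two forms.
    form_a = ["verbal" if i % 2 == 0 else "drawn" for i in range(n_items)]
    form_b = ["drawn" if fmt == "verbal" else "verbal" for fmt in form_a]
    return form_a, form_b


def score_student(form, operation_correct, solution_correct):
    """Compute Drawing, Verbal, and Total Scores for one student."""
    # ASSUMPTION: one point per correct operation, one per correct solution.
    points = {"verbal": 0, "drawn": 0}
    for fmt, op_ok, sol_ok in zip(form, operation_correct, solution_correct):
        points[fmt] += int(op_ok) + int(sol_ok)
    return {
        "Drawing Score": points["drawn"],
        "Verbal Score": points["verbal"],
        "Total Score": points["drawn"] + points["verbal"],
    }


form_a, form_b = build_forms()
# Hypothetical student: every operation correct, first 8 solutions correct.
result = score_student(form_a, [True] * 16, [True] * 8 + [False] * 8)
print(result)
```

Swapping the formats between forms means that every problem is seen in verbal format by some students and in drawn format by others, so a format difference is not confounded with the particular problems used.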

After the three aptitude measures were given to the students,
five classes were randomly assigned to each of the verbal and drawing
treatment groups. During the five-to-six-week treatment period, students
in the verbal group were given four sets of 8 verbal problems for 
practice, and students in the drawing group were given four sets of 
8 drawn problems. The two forms of the posttest were then randomly 
administered to the students. 

4. Findings 

The Drawing Scores (mean = 11.87, s.d. = 3.67) were significantly,
but not substantially, higher than the Verbal Scores (mean = 11.21,
s.d. = 3.79). In order to test for a difference in problem-solving
performance between the verbal and drawing treatment groups, ANOVAs 
were run, first using classes as the unit of analysis (no significant 
differences), then using students as the unit. In the latter case, 
the verbal group scored significantly higher (p < .05) on the Total 
and Verbal Scores. It was noted, however, that the verbal group also
scored slightly higher on the Arithmetic Reasoning Test, so the two 
treatment groups could not be assumed to be equivalent. Informal 
interviews of students indicated that most students preferred problems 
in the drawn format. 
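
The difference between the two units of analysis mentioned above can be illustrated with a small sketch. The scores, the class sizes, and the function names are hypothetical and are not data from the study; the point is only that aggregating to class means leaves far fewer error degrees of freedom than pooling students.

```python
def one_way_f(groups):
    """One-way ANOVA F ratio and degrees of freedom for a list of groups."""
    all_values = [v for g in groups for v in g]
    grand = sum(all_values) / len(all_values)
    ss_between = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)
    df_b, df_w = len(groups) - 1, len(all_values) - len(groups)
    return (ss_between / df_b) / (ss_within / df_w), (df_b, df_w)


def class_means(classes):
    # Class as the unit of analysis: one observation per class.
    return [sum(c) / len(c) for c in classes]


def pooled(classes):
    # Student as the unit of analysis: class membership is ignored.
    return [score for c in classes for score in c]


# Hypothetical Total Scores, two classes per treatment, two students each.
verbal_classes = [[10, 12], [14, 16]]
drawing_classes = [[9, 11], [11, 13]]

f_class, df_class = one_way_f(
    [class_means(verbal_classes), class_means(drawing_classes)]
)
f_student, df_student = one_way_f(
    [pooled(verbal_classes), pooled(drawing_classes)]
)
print(f_class, df_class, f_student, df_student)
```

With the same underlying scores, the student-level analysis has more error degrees of freedom and so is more likely to reach significance, but it treats students in the same class as independent observations.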

Each of the three posttest scores was regressed on the aptitude 
measures to test for aptitude-treatment interactions. (All three 
aptitude measures were positively correlated with the posttest scores.) 
The only indications of ATIs were disordinal interactions between treat- 
ments and the Hidden Figures Test for both the Total Score and the 
Drawing Score (p < .10). For the Drawing Score, the slope of the 
regression line for the verbal treatment group was slightly greater 
than that for the drawing group.
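
What a disordinal aptitude-treatment interaction looks like can be sketched by fitting a separate regression line for each treatment group and checking whether the lines cross inside the observed aptitude range. The data, function names, and the crossing criterion below are illustrative assumptions; the sketch does not test the interaction for statistical significance, and it is not the authors' analysis.

```python
def fit_line(points):
    """Least-squares intercept and slope for a list of (x, y) points."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    slope = sxy / sxx
    return my - slope * mx, slope


def crossover(verbal_points, drawing_points):
    """Return (aptitude value where the lines cross, whether it is in range)."""
    a_v, b_v = fit_line(verbal_points)
    a_d, b_d = fit_line(drawing_points)
    if b_v == b_d:
        return None, False          # parallel lines never cross
    x_cross = (a_d - a_v) / (b_v - b_d)
    xs = [x for x, _ in verbal_points + drawing_points]
    return x_cross, min(xs) <= x_cross <= max(xs)


# Hypothetical (Hidden Figures score, Drawing Score) pairs: the verbal
# treatment line is steeper than the drawing treatment line.
verbal_points = [(x, 2 + 1.0 * x) for x in range(0, 11, 2)]
drawing_points = [(x, 5 + 0.5 * x) for x in range(0, 11, 2)]

x_cross, disordinal = crossover(verbal_points, drawing_points)
print(x_cross, disordinal)
```

In this made-up example the drawing treatment gives higher predicted scores below the crossing point and the verbal treatment gives higher predicted scores above it, which is the pattern a disordinal interaction describes.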

5. Interpretations

The authors state that "Presenting problems by way of drawings 
was clearly more effective than the standard words-only presentation 
for these students. Students interviewed about preference indicated
that the drawings helped clarify problems" (p. 329). Furthermore, the 
authors suggest that the interaction between treatments and field 
independence on the Drawing Score and Total Score indicates that 
practice on drawn format problems may be more helpful than practice on 
verbal format problems for field dependent students, with the reverse 
true for field independent students. They state, "Perhaps practice on 
drawn problems serves to distract the field independent students by 
providing them with unnecessary mediators" (p. 329). 

Abstractor's Comments 

When I first examined the pair of example problems provided in the 
article, one in verbal format, the other in drawn, I thought the drawn 
problem would be more difficult for students. It seemed that students 
would have to analyze the drawing more carefully than the verbal
problem in order to decide what the problem asked and what was given.
The results of the study indicated that, at least for fifth-grade
students, performance on drawn format problems was somewhat higher than 
on verbal format problems. However, if in fact the drawn format problems 
require more analysis than verbal format problems, we should expect 
that there would be a greater difference in performance between field 
independent and field dependent students on drawn format problems 
than on verbal format problems because the field independent students 
are more likely to utilize analysis (Witkin et al., 1977). Testing this
hypothesis would require comparing the regression line for Drawing Scores
on Hidden Figures Test scores to the line for Verbal Scores on Hidden
Figures Test scores. This was not done in the present study.

Instead, the authors chose to focus on the observed (though not 
significant) ATI that suggested that the difference in Drawing Score 
performance between field independent and field dependent students was 
greater for the verbal treatment group than the drawing treatment 
group. The authors hypothesized that this result could have been caused 
by the fact that practice on drawn problems was distracting for field 
independent students (and not for field dependent students). An
alternate explanation is that when the verbal group was tested with 
drawn format problems, the novelty of the drawn format required the 
students to do more analysis than they would have done with a familiar 
format. Thus, since field independent students are more likely to use 
analysis, the observed ATI is consistent with this hypothesis.

In addition to field independence, the other variable investigated 
by this study was spatial visualization. Although spatial visualization 
was more highly correlated with the problem-solving scores than field 
independence, not much mention was made of its effect on problem-solving 
performance. Apparently, there was no ATI between spatial visualization 
and treatment on any of the posttest measures. But it would have been 
interesting to see some of the relevant data. For instance, it was
hypothesized that one reason spatial visualization was important to
consider as a factor in solving drawn problems is the likelihood that
visual encoding involves some use of spatial relationships. This would
seem to imply that spatial visualization would be more important for
success in solving and practicing drawn format problems than verbal. 
However, with some verbal problems, a key element in solving the problem 
might be to visualize or imagine a situation. In this case, spatial
visualization would seem to be very important in solving the verbal 
format version of the problem, but not so important in solving the 
drawn format version, since in the drawn format the visualizing is 
already done for the student. 

There are several other questions that should be considered when 
interpreting the results of the study: Is there an interaction between 
treatments or testing format and reading ability? For instance, are 
students with low reading ability better able to solve problems in drawn 
format, or is the drawn format treatment more effective for them? 
What is it about the drawn format that makes the problems easier to 
solve? Are drawings simply more interesting to students? The example 
problem given in the article had drawings of human-like characters. 
Did all of the problems used in the study have such characters? Maybe 
students are more attentive to such problems — especially field- 
dependent students (Witkin et al., 1977). 

All in all, I found this to be an interesting study. It raised 
many theoretical questions on which I would like to see further 
research. As for instructional implications, the authors state, "The 
confirmation that with fifth graders a drawn format can give a problem- 
solving performance superior to that of verbal format has clear implications 
for textbook publishers and teachers" (p. 329). Since the difference in 
performance on drawn and verbal presentation format was moderate, and 
since we don't know if the results hold true at other grade levels or 
what the long-range effects would be if too great an emphasis were 
placed on drawn format problems, I would hope that textbook publishers 
and teachers move cautiously in utilizing the present results. For now, 
it seems prudent to say only that practice on drawn format problems is 
an alternate instructional strategy that can be used to help improve 
students' problem-solving performance.